Triplex

The Triplex model by SciPhi is an advanced tool for knowledge graph construction. It excels in extracting triplets (subject, predicate, object) from unstructured data, significantly reducing costs and improving performance compared to traditional models like GPT-4.

Key Features of Triplex

Cost Efficiency

Offers a 98% reduction in costs for creating knowledge graphs, outperforming models like GPT-4 at a fraction of the cost.

High Performance

Trained on diverse datasets, ensuring robustness and versatility across various applications.

Open Source

Available on platforms like Hugging Face, making it accessible for developers and researchers.

Advanced Training Techniques

Utilizes Dynamic Programming Optimization (DPO) and Knowledge Triplet Optimization (KTO) for improved accuracy and efficiency.

Download and Install Triplex

Step 1: Install the Required Packages

Run the following command to install the necessary libraries:

pip install transformers torch

Step 2: Clone the R2R Repository

Clone the repository from GitHub and navigate to the directory:

git clone https://github.com/SciPhi-AI/R2R.git
cd R2R

Step 3: Install R2R

Install R2R using pip:

pip install r2r

Or use Docker for a more streamlined setup:

r2r --config-name=default serve --docker

How to Use Triplex

Loading the Model and Tokenizer

Use the following code to load the model and tokenizer:


from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("SciPhi/Triplex")
tokenizer = AutoTokenizer.from_pretrained("SciPhi/Triplex")

Extracting Triplets

Define and use a function to extract triplets from text:


import json
def triplextract(model, tokenizer, text, entity_types, predicates):
input_format = """Perform Named Entity Recognition (NER) and extract knowledge graph triplets from the text.
NER identifies named entities of given entity types, and triple extraction identifies relationships
between entities using specified predicates.
**Entity Types:**
{entity_types}
**Predicates:**
{predicates}
**Text:**
{text}
"""
message = input_format.format(entity_types=json.dumps({"entity_types": entity_types}), predicates=json.dumps({"predicates": predicates}), text=text)
messages = [{'role': 'user', 'content': message}] input_ids = tokenizer(messages, return_tensors="pt").input_ids
output = model.generate(input_ids=input_ids, max_length=2048)
return tokenizer.decode(output[0], skip_special_tokens=True)

Example Usage

Extract triplets from a sample text:


text = "Paris is the capital of France."
entity_types = ["CITY", "COUNTRY"] predicates = ["CAPITAL_OF", "LOCATED_IN"] triplets = triplextract(model, tokenizer, text, entity_types, predicates)
print(triplets)

Additional Tips for Triplex

Optimizing Performance

  • Use a temperature setting of 0.3 for optimal results.
  • Ensure your hardware meets the requirements for running large models efficiently.

Triplex is designed to be a robust and flexible model suitable for a wide range of applications. Whether you need advanced reasoning, multilingual support, or efficient text compression, this model provides a comprehensive solution. By following the installation and usage guidelines, you can harness the full potential of Triplex for your projects.

Leave a Comment