Model Card for fvossel/Llama-3.1-8B-Instruct-nl-to-fol

This model contains LoRA adapter weights for the base model meta-llama/Llama-3.1-8B-Instruct. It was trained to translate natural language statements into First-Order Logic (FOL) representations.

Model Details

Model Description

Developed by: Vossel et al. at Osnabrück University
Funded by: Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) 456666331
Model type: Decoder-only causal language model (LLaMA architecture)
Language(s) (NLP): Englisch, FOL
License: This repository contains only LoRA adapter weights, trained using the base model meta-llama/Llama-3.1-8B-Instruct, which is released under the Meta Llama 3 Community License. These adapter weights are released under the Apache 2.0 License.
Finetuned from model: meta-llama/Llama-3.1-8B-Instruct

Uses

Direct Use

This model is designed to translate natural language (NL) sentences into corresponding first-order logic (FOL) expressions. It can be used directly for:

Automated semantic parsing and formalization of NL statements into symbolic logic.
Supporting explainable AI systems that require symbolic reasoning based on language input.
Research in neurosymbolic AI, logic-based natural language understanding, and formal verification.
Integrating into larger pipelines for natural language inference, question answering, or knowledge base population.

Users should be aware that the model outputs symbolic formulas which may require further validation or post-processing depending on the application.

Downstream Use

This LoRA adapter can be further fine-tuned or combined with other models for specialized domains, e.g., legal, biomedical, or scientific text formalization. It is also suitable for integration into interactive systems that require formal reasoning capabilities.

Out-of-Scope Use

This model is not intended for general-purpose natural language generation.
It may not perform well on ambiguous, highly figurative, or out-of-domain sentences.
Users should not rely on the model output as final legal, medical, or safety-critical logic without expert review.

Recommendations

The model may produce logically incorrect or syntactically invalid formulas, especially on complex or ambiguous input. Users should validate outputs before use in critical applications.
As the training data contains inherent biases from the source datasets and GPT-4 generation, users should be cautious about potential biases reflected in the model’s predictions.
The model is specialized for English natural language and first-order logic; it may not generalize well to other languages or more expressive logics.
Users applying this model in downstream tasks should carefully evaluate its performance and consider human-in-the-loop validation to ensure correctness.
This model is intended for research and prototyping purposes and should not be used as a standalone solution in high-stakes decision making without expert oversight.

How to Get Started with the Model

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model repository identifiers
base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
lora_weights = "fvossel/Llama-3.1-8B-Instruct-nl-to-fol"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(base_model_name, trust_remote_code=True, device_map="auto")

# Load the LoRA adapter weights
model = PeftModel.from_pretrained(model, lora_weights, device_map="auto")

# Define formatting function for preparing the input prompt
def formatting_func(text):
    return tokenizer.apply_chat_template(
        [
            {
                "role": "system",
                "content": (
                    "You are a helpful AI assistant that translates Natural Language (NL) text "
                    "into First-Order Logic (FOL) using only the given quantors and junctors: "
                    "∀ (for all), ∃ (there exists), ¬ (not), ∧ (and), ∨ (or), → (implies), "
                    "↔ (if and only if), ⊕ (xor). "
                    "Start your answer with '𝜙=' followed by the FOL-formula. Do not include any other text."
                ),
            },
            {"role": "user", "content": text},
        ],
        tokenize=False,
        add_generation_prompt=False,
    )

# Example NL input
input_text = "All dogs are animals."

# Use formatting_func to create the model input
prompt = formatting_func(input_text)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt", padding=True)

# Generate output
outputs = model.generate(**inputs, max_new_tokens=100)

# Decode and print the result
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned on the groves dataset.

Training Procedure

The model was fine-tuned using LoRA adapters on top of the pre-trained "meta-llama/Llama-3.1-8B-Instruct" model. The training involved:

Prompt-based instruction tuning

Training Hyperparameters

Training regime: bf16 mixed precision
Batch size: 8 (per device)
Learning rate: 1e-5
Number of epochs: 12
Optimizer: AdamW
Scheduler: Cosine learning rate scheduler
Warmup ratio: 0.05
Gradient accumulation steps: 2
Weight decay: 0.01
LoRA rank (r): 16
LoRA alpha: 32
LoRA dropout: 0.05
Target modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
Bias: none
Task type: CAUSAL_LM
Early stopping patience: 4 epochs

Downloads last month: -; Downloads are not tracked for this model. How to track

Model tree for fvossel/Llama-3.1-8B-Instruct-groves

Base model

meta-llama/Llama-3.1-8B

Finetuned

meta-llama/Llama-3.1-8B-Instruct

Finetuned

(2771)

this model

Dataset used to train fvossel/Llama-3.1-8B-Instruct-groves

Collection including fvossel/Llama-3.1-8B-Instruct-groves

NL-TO-FOl

Collection

Models fine-tuned for NL-FOL translation • 40 items • Updated Apr 24