T5-LM-Large text2sql-spider — ONNX

ONNX export of T5-LM-Large-text2sql-spider (0.8B parameters) with encoder-decoder architecture and KV cache support.

This is a T5-large model fine-tuned on the Spider and Spider-Syn datasets for text-to-SQL generation. Given a natural language question and a database schema, it produces the corresponding SQL query.

Converted for use with inference4j, an inference-only AI library for Java.

Original Source

Usage with inference4j

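// try-with-resources closes the generator when the block exits.
try (var sqlGen = T5SqlGenerator.t5LargeSpider().build()) {
    String sql = sqlGen.generateSql(
        // Natural language question
        "How many employees are in each department?",
        // Database schema, in the format described under Schema Format
        "\"employees\" \"id\" int, \"name\" varchar, \"dept_id\" int "
        + "[SEP] \"departments\" \"id\" int, \"name\" varchar");
    System.out.println(sql);
}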

Schema Format

The model expects the schema in the following format:

"table_name" "col1" type, "col2" type, foreign_key: "table"."col" = "other"."col" primary key: "col" [SEP] "table2" ...
  • Table and column names are double-quoted
  • Columns are comma-separated with types
  • Tables are separated by [SEP]
  • Foreign keys and primary keys are declared per table
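
The rules above can be sketched as a small helper that builds the schema string from table definitions. This is only an illustration: `SchemaFormatter`, `Table`, and `Column` are hypothetical names that are not part of inference4j, and the sketch leaves out foreign key and primary key declarations, which would need to be appended per table in the format shown above.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of inference4j: builds the schema string
// from simple table definitions. Key declarations are omitted in this sketch.
public class SchemaFormatter {

    /** One column: a name plus its SQL type, e.g. ("id", "int"). */
    public record Column(String name, String type) {}

    /** One table: a name plus its columns. */
    public record Table(String name, List<Column> columns) {}

    /** Wraps a table or column name in double quotes. */
    public static String quote(String identifier) {
        return "\"" + identifier + "\"";
    }

    /** Formats one table as: "table" "col1" type, "col2" type */
    public static String formatTable(Table table) {
        String columns = table.columns().stream()
                .map(c -> quote(c.name()) + " " + c.type())
                .collect(Collectors.joining(", "));
        return quote(table.name()) + " " + columns;
    }

    /** Joins the formatted tables with the [SEP] separator. */
    public static String formatSchema(List<Table> tables) {
        return tables.stream()
                .map(SchemaFormatter::formatTable)
                .collect(Collectors.joining(" [SEP] "));
    }

    public static void main(String[] args) {
        Table employees = new Table("employees", List.of(
                new Column("id", "int"),
                new Column("name", "varchar"),
                new Column("dept_id", "int")));
        Table departments = new Table("departments", List.of(
                new Column("id", "int"),
                new Column("name", "varchar")));

        System.out.println(formatSchema(List.of(employees, departments)));
    }
}
```

For the two tables in `main`, the helper produces the same schema string that is passed to `generateSql` in the usage example.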

Model Details

Property             Value
Architecture         T5 encoder-decoder (0.8B parameters)
Task                 Text-to-SQL generation
Training data        Spider, Spider-Syn
Tokenizer            SentencePiece (32,128 tokens)
Original framework   PyTorch (transformers)
Export method        Hugging Face Optimum (encoder-decoder with KV cache)

License

This model is licensed under the Apache License 2.0. Original model by Gaussalgo, base model by Google.
