Instructions to use Truthseeker87/solarhive-26b-a4b-merged with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Truthseeker87/solarhive-26b-a4b-merged with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Truthseeker87/solarhive-26b-a4b-merged")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Truthseeker87/solarhive-26b-a4b-merged")
model = AutoModelForImageTextToText.from_pretrained("Truthseeker87/solarhive-26b-a4b-merged")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Truthseeker87/solarhive-26b-a4b-merged with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Truthseeker87/solarhive-26b-a4b-merged"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Truthseeker87/solarhive-26b-a4b-merged",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker

docker model run hf.co/Truthseeker87/solarhive-26b-a4b-merged

SGLang

How to use Truthseeker87/solarhive-26b-a4b-merged with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Truthseeker87/solarhive-26b-a4b-merged" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Truthseeker87/solarhive-26b-a4b-merged",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Truthseeker87/solarhive-26b-a4b-merged" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Truthseeker87/solarhive-26b-a4b-merged",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Unsloth Studio

How to use Truthseeker87/solarhive-26b-a4b-merged with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Truthseeker87/solarhive-26b-a4b-merged to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Truthseeker87/solarhive-26b-a4b-merged to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Truthseeker87/solarhive-26b-a4b-merged to start chatting

Load model with FastModel

# Install first (shell command, run outside Python): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="Truthseeker87/solarhive-26b-a4b-merged",
    max_seq_length=2048,
)

Docker Model Runner
How to use Truthseeker87/solarhive-26b-a4b-merged with Docker Model Runner:
```
docker model run hf.co/Truthseeker87/solarhive-26b-a4b-merged
```

SolarHive 26B A4B Merged — Community Solar Energy Intelligence

Overview

SolarHive 26B A4B Merged is the production-ready version of solarhive-26b-a4b-lora — LoRA adapters pre-merged into the base weights for direct loading with AutoModelForCausalLM (no Unsloth or PEFT needed at inference time).

It is a LoRA fine-tuned Gemma 4 26B A4B (MoE) model specialized in community solar energy intelligence with native function calling, multimodal VQA, and selective tool reasoning.

Key Features:

Domain expertise in solar production, battery management, grid optimization, and community coordination
Multimodal visual question answering (sky analysis, panel inspection, neighborhood assessment)
Native function calling for 5 energy-specific tools (listed under Core Capabilities)
Grounded responses referencing real API data
No Unsloth dependency — loads with standard transformers

Mission

SolarHive is an open-source intelligence layer designed to coordinate community microgrids and community-based storage via fuel cells, pool midday energy surplus across those microgrids, and eliminate stranded capacity. It also helps forecast solar irradiance and cloud cover so communities can plan ahead.

Why 26B A4B? — Model Architecture Selection

Gemma 4 offers four model sizes. We evaluated all four and selected two complementary architectures for a dual fine-tune strategy — one for cloud inference (this model), one for edge deployment (E4B):

Model	Params (Total / Active)	Architecture	Vision Encoder	Context	Modalities	Selection
E2B	5.1B / 2.3B effective	Dense + PLE	~150M	128K	Text, Image, Audio, Video	Ollama serving target
E4B	8B / 4.5B effective	Dense + PLE	~150M	128K	Text, Image, Audio, Video	Fine-tuned for edge
26B A4B	25.2B / 3.8B active	MoE (8/128)	~550M	256K	Text, Image	This model — cloud inference
31B	30.7B / 30.7B	Dense	~550M	256K	Text, Image	Rejected

SolarHive requires two core capabilities: multimodal VQA (analyzing sky photos and panel images) and native function calling (invoking weather, solar, battery, and grid APIs in agentic loops). The official benchmarks show why 26B A4B delivers the best capability-to-cost ratio:

Benchmark	SolarHive Use Case	E4B	26B A4B	31B
MMMU Pro (vision)	Sky/panel VQA analysis	52.6%	73.8%	76.9%
MATH-Vision	Visual reasoning on solar data	59.5%	82.4%	85.6%
OmniDocBench (lower=better)	Document understanding	0.181	0.149	0.131
MMLU Pro	Domain expertise (energy advisory)	69.4%	82.6%	85.2%
GPQA Diamond	Scientific reasoning	58.6%	82.3%	84.3%
MRCR v2 128K	Multi-round tool-calling context	25.4%	44.1%	66.4%

Source: Gemma 4 Model Card. All four models support native function calling and agentic workflows.

Why 26B A4B wins for SolarHive:

~550M vision encoder delivers 73.8% MMMU Pro for sky/panel VQA: about 40% higher than E4B (52.6%) in relative terms, and only about 4% below 31B (76.9%)
MoE sparse activation (3.8B active of 25.2B) achieves ~95% of 31B quality at a fraction of the compute
256K context window accommodates multi-round agentic tool-calling loops (4 API calls per turn)
Best domain absorption — converged loss 0.6956 vs E4B's 0.9218 on the same training corpus
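
The percentages in the list above can be re-derived from the benchmark table. The sketch below is plain arithmetic over the published scores; the dictionary layout and variable names are illustrative and are not part of the SolarHive code. MRCR v2 and OmniDocBench are left out because the "~95% of 31B quality" claim is read here as referring to the four accuracy benchmarks, which is an assumption.

```python
# Scores copied from the benchmark table above (percent, higher is better).
scores = {
    "MMMU Pro":     {"E4B": 52.6, "26B A4B": 73.8, "31B": 76.9},
    "MATH-Vision":  {"E4B": 59.5, "26B A4B": 82.4, "31B": 85.6},
    "MMLU Pro":     {"E4B": 69.4, "26B A4B": 82.6, "31B": 85.2},
    "GPQA Diamond": {"E4B": 58.6, "26B A4B": 82.3, "31B": 84.3},
}

mmmu = scores["MMMU Pro"]
gain_over_e4b = mmmu["26B A4B"] / mmmu["E4B"] - 1.0   # relative gain over E4B
gap_to_31b = 1.0 - mmmu["26B A4B"] / mmmu["31B"]      # relative shortfall vs 31B

# Mean ratio of 26B A4B to 31B across the four accuracy benchmarks.
ratios = [row["26B A4B"] / row["31B"] for row in scores.values()]
mean_ratio = sum(ratios) / len(ratios)

print(f"MMMU Pro gain over E4B: {gain_over_e4b:.1%}")
print(f"MMMU Pro gap to 31B:    {gap_to_31b:.1%}")
print(f"Mean 26B A4B / 31B:     {mean_ratio:.1%}")
```

Both MMMU Pro figures are relative changes; in absolute terms the gaps are 21.2 and 3.1 percentage points.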

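Why not 31B? It is only about 2 to 3 percentage points better on the vision and reasoning benchmarks above, yet needs 2–4x more compute and VRAM. Its one large lead is MRCR v2 128K long-context retrieval (66.4% vs 44.1%). That is not worth the cost for a community energy advisor.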

Benchmark Results

Domain Q&A (5/5)

All domain questions answered correctly:

Solar production impact of humidity/weather
Battery management optimization
Diagnostic troubleshooting
Seasonal planning
Grid frequency interpretation

Production Benchmark (8/8) — Agentic Loop

When evaluated with tool schemas in context (BF16):

5/5 Q&A correct
3/3 tool calling correct (get_battery_state, get_weather, selective tool reasoning)

Multi-Variant Deployment Validation (Final Run, May 2026)

End-to-end inference run on Colab Pro G4 (NVIDIA RTX PRO 6000 Blackwell, 102 GB VRAM total). This A4B BF16 merged variant was loaded from a local cache (52.4 GB VRAM utilization) via AutoModelForCausalLM.from_pretrained(..., dtype=torch.bfloat16) — no BitsAndBytesConfig.

Score: 5/5 Q&A + 4/5 tool = 9/10 on the 10-question parity benchmark.

The single FAIL is the lenient multi-call probe — "Compare today's irradiance forecast across Ann Arbor, Phoenix, and Seattle" (min_calls=2) — where this variant returned no tool call. The same multi-call failure appears on 4 of 5 measured variants in this run; only the E4B LoRA + base variant chained the multi-city calls (3 × get_weather). Worth a multi-trial re-run to characterize whether this is stochastic at temperature=1.0 or systematic.
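
One way to run the multi-trial check suggested above is to repeat the probe and report the fraction of trials that reach min_calls. In this sketch `run_probe` is a hypothetical callable standing in for one sampled generation plus tool-call parsing; it is not a function from the SolarHive notebooks, and `fake_probe` only simulates a model.

```python
import random
from typing import Callable, List


def multi_trial_pass_rate(
    run_probe: Callable[[], List[str]],
    n_trials: int = 20,
    min_calls: int = 2,
) -> float:
    """Fraction of trials in which the probe produced at least `min_calls` tool calls."""
    passes = 0
    for _ in range(n_trials):
        tool_calls = run_probe()  # one sampled generation, returned as tool names
        if len(tool_calls) >= min_calls:
            passes += 1
    return passes / n_trials


# Stand-in for the model: chains three get_weather calls in roughly 30% of trials.
# Swap in a real generation + tool-call parser to measure an actual variant.
rng = random.Random(0)


def fake_probe() -> List[str]:
    return ["get_weather"] * 3 if rng.random() < 0.3 else []


rate = multi_trial_pass_rate(fake_probe, n_trials=200)
print(f"pass rate over 200 trials: {rate:.2f}")
```

A pass rate at or near zero across many trials would point to a systematic failure, while an intermediate rate would point to sampling variance at temperature=1.0.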

Outputs match the LoRA + base baseline closely: both this merged variant and the LoRA + base variant produce textually similar Q&A answers (e.g., the 22% underperformance diagnostic checklist is structurally identical), which is consistent with a lossless merge step.

When2Call score: 3/3 (inferred from the A4B LoRA baseline). The When2Call probe suite was measured directly on the A4B LoRA baseline (3/3) and on the E4B merged variant (2/3). This A4B merged variant inherits the 3/3 score by lossless equivalence: save_pretrained_merged("merged_16bit") produces standalone BF16 safetensors with identical numerical content to the LoRA + base load, so the refusal/follow-up decision boundary is unchanged. We label the score as inferred, not directly measured in the May 2026 inference run, to distinguish it from the directly measured 3/3 on A4B LoRA. For comparison, the E4B family (solarhive-e4b-lora + solarhive-e4b-ollama) scores 2/3: it fails probe (d) by calling get_weather for an air-quality question.

The +1/3 When2Call delta between the A4B and E4B families is the empirical signature of size-vs-refusal scaling. A4B outperforming the smaller E4B fine-tune on reasoning-heavy probes was the pre-stated hypothesis, following the official Google Gemma 4 docs: "Models with higher parameters and bit counts are generally more capable". This 26B A4B model draws on ~25B total parameters (3.8B active per token via MoE sparsity) and a ~550M vision encoder, versus E4B's 8B total / 4.5B effective parameters and ~150M vision encoder.

Key Specifications

Parameter	Value
Base Model	google/gemma-4-26b-a4b-it
Architecture	MoE — 25.2B total, 3.8B active (8/128 experts)
Modalities	Text + Image
Context Length	256K tokens
Fine-Tuning Method	LoRA via Unsloth (BF16), merged to 16-bit (`merged_16bit`)
Training Data	1,727 examples (solarhive-community-solar-multimodal) — text-only fine-tune; VQA at inference uses the base Gemma 4 vision encoder (~550M params), unmodified by our LoRA per the Vertex AI SFT recipe
Converged Loss	0.6956
Benchmark Score	9/10 (5/5 domain Q&A + 4/5 tool calling), May 2026 final run; multi-call regression on TQ5 (see Multi-Variant Deployment Validation above)
Precision	BF16 (~48 GB)
License	MIT (adapters) / Gemma Terms (base model)

Precision Note — BF16 is Gemma 4's Native Release Format

BF16 is Google's native release precision for Gemma 4. The open-source base model at google/gemma-4-26b-a4b-it is itself published in BF16 — there is no FP32 release to begin with. This merged variant preserves that precision exactly: the LoRA fine-tuning delta is folded into the base weights at BF16 via Unsloth's save_method="merged_16bit". The result is a single safetensors artifact with the same numerical precision as the open-source base plus the SolarHive fine-tune delta — not a quantization downgrade.
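
To make "folded into the base weights" concrete, the sketch below checks the standard LoRA merge identity, W' = W + (alpha / r) · B · A, on toy matrices. It is an illustration of the general technique in pure Python floats, not Unsloth's implementation, and it says nothing about BF16 rounding behaviour; the matrix sizes and helper names are made up for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in a]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_out, d_in, rank, alpha = 6, 5, 2, 2   # toy sizes; this card reports rank=16, alpha=16
W = rand_matrix(d_out, d_in, rng)       # frozen base weight
B = rand_matrix(d_out, rank, rng)       # LoRA "up" projection
A = rand_matrix(rank, d_in, rng)        # LoRA "down" projection
x = rand_matrix(d_in, 1, rng)           # one input column vector
s = alpha / rank                        # LoRA scaling factor

# Adapter path: base output plus the scaled low-rank correction.
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), s))

# Merged path: fold the delta into the weight once, then do a single matmul.
W_merged = add(W, scale(matmul(B, A), s))
y_merged = matmul(W_merged, x)

max_diff = max(abs(p[0] - q[0]) for p, q in zip(y_adapter, y_merged))
print(f"max |adapter - merged| = {max_diff:.2e}")
```

With rank 16 and alpha 16, as in the training table, the scaling factor alpha / r is 1, so the merged weight is simply W + B · A.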

Comparison	Precision	Source
Google's open-source Gemma 4 26B A4B base	BF16	google/gemma-4-26b-a4b-it
This merged variant (base + SolarHive LoRA folded in)	BF16 (same as base)	This repo
Quantized variant (4-bit packed)	NF4	solarhive-26b-a4b-nf4

Why publish a merged BF16 artifact rather than just the LoRA adapters? Two reasons: (1) downstream consumers (HF Spaces, NF4 quantization, evaluation pipelines) can from_pretrained(...) directly without a PEFT/Unsloth dependency at inference time; (2) the merged artifact is the canonical input for the NF4 quantization pipeline — it guarantees the quantized weights derive from the same BF16 numerics the BF16 benchmark validated.

Training Details

Parameter	Value
Method	LoRA via Unsloth FastVisionModel (BF16, RTX PRO 6000 Blackwell 102 GB)
LoRA rank	16
LoRA alpha	16
Learning rate	2e-4
Epochs	3
Max sequence length	2048
Precision	BF16
Trainable parameters	505.4M / 26.3B (1.92%)
Training time	7,198 seconds (~120 minutes)
Hardware	Google Colab Pro (NVIDIA RTX PRO 6000 Blackwell)

Training Data — 1,727 Examples

Canonical training corpus: solarhive-community-solar-multimodal:

413 hand-crafted examples across 15+ US cities, 9 energy domains
~1,117 API-grounded examples from Open-Meteo, PVWatts, OpenWeatherMap, EIA
183 tool-calling examples following the When2Call taxonomy — 106 should-call, 53 should-not-call, 10 unable-to-answer, 6 follow-up clarification, 8 failure-recovery
14 image-grounded Q&A turns from 7 manually-labeled Ann Arbor sky photographs
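
As a quick consistency check, the counts listed above can be summed in a few lines. The dictionary keys are illustrative labels rather than column names from the dataset.

```python
# Counts copied from the corpus description above.
corpus = {
    "hand_crafted": 413,
    "api_grounded": 1117,   # listed as approximate (~1,117)
    "tool_calling": 183,
    "image_grounded": 14,
}
tool_calling_breakdown = {
    "should_call": 106,
    "should_not_call": 53,
    "unable_to_answer": 10,
    "follow_up_clarification": 6,
    "failure_recovery": 8,
}

total = sum(corpus.values())
tool_total = sum(tool_calling_breakdown.values())
text_only = total - corpus["image_grounded"]

print(f"total examples: {total}")
print(f"tool-calling examples: {tool_total}")
print(f"text-only examples: {text_only}")
```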

Fine-tuning is text-only on the multimodal-capable corpus (image rows skipped at the data-prep layer). VQA at inference uses the base Gemma 4 26B A4B model's pretrained vision encoder (~550M params per the official model card). Our LoRA targets only the language-model linear layers (target=all-linear); the vision tower is unmodified, matching the Vertex AI Gemma 4 SFT recipe documented in the Hugging Face blog, which explicitly freezes both vision and audio towers during text-focused fine-tuning.

How to Use

Loading with Transformers (No Unsloth Needed)

This is the merged model — LoRA weights are baked into the base weights. Load directly with transformers:

from transformers import AutoProcessor, AutoModelForCausalLM
import torch

processor = AutoProcessor.from_pretrained(
    "google/gemma-4-26b-a4b-it",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "Truthseeker87/solarhive-26b-a4b-merged",
    dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

Two-Step Tokenization (Required)

messages = [
    {"role": "system", "content": "You are SolarHive, an AI energy advisor for a community of 12 homes with rooftop solar and shared battery storage in Ann Arbor, Michigan."},
    {"role": "user", "content": "How will today's weather affect our solar production?"},
]

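# Step 1: render text (tokenize=False)
# `tools` is the list of tool functions defined under "Native Function Calling" below;
# define it before running this step, or drop the argument for plain Q&A.
text = processor.apply_chat_template(
    messages, tools=tools,
    add_generation_prompt=True,
    enable_thinking=False,
    tokenize=False,
)

# Step 2: tokenize separately
inputs = processor(text=text, images=None, return_tensors="pt").to(model.device)
# do_sample=True makes the sampling parameters below take effect explicitly
output = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=1.0, top_p=0.95, top_k=64)
# Decode only the newly generated tokens (skip the prompt)
response = processor.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)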

Native Function Calling

def get_weather(location: str) -> dict:
    """Get current weather conditions for a location.

    Args:
        location: City name, e.g. 'Ann Arbor, MI'

    Returns:
        dict with temp_f, clouds_pct, wind_mph, humidity, sunrise, sunset
    """
    ...

def get_solar_production(clouds_pct: int, temp_f: float) -> dict:
    """Get estimated community solar production using GHI irradiance data.

    Args:
        clouds_pct: Cloud cover percentage (0-100)
        temp_f: Temperature in Fahrenheit

    Returns:
        dict with production_kw, capacity_kw, efficiency_pct, ghi_wm2
    """
    ...

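def get_battery_state() -> dict:
    """Get the current state of the shared community battery.

    Returns:
        dict with state of charge, capacity, and charging status
    """
    ...

def get_grid_status() -> dict:
    """Get current grid conditions for the community's region.

    Returns:
        dict with pricing period, rate per kWh, renewable percentage, and CO2 intensity
    """
    ...

# All four functions must be defined before they are passed to the chat template
tools = [get_weather, get_solar_production, get_battery_state, get_grid_status]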

text = processor.apply_chat_template(
    messages, tools=tools,
    add_generation_prompt=True,
    enable_thinking=False,
    tokenize=False,
)

Core Capabilities

1. Multimodal Visual Question Answering (3 Modes)

Mode	Input	Output
Sky Analysis	Sky photograph	Cloud coverage %, production forecast, storage recommendation
Panel Inspection	Panel photograph	Dirt/damage/shading detection, efficiency impact estimate
Neighborhood Assessment	Aerial/satellite image	Panel inventory, expansion priorities, shading analysis

2. Native Function Calling (5 Tools — all 3 keyed APIs wired)

Tool	API	Returns
`get_weather(location)`	OpenWeatherMap (`OWM_API_KEY`)	Temperature, clouds %, wind, humidity, sunrise/sunset
`get_solar_production(clouds_pct, temp_f)`	Open-Meteo GHI (keyless)	Production kW, efficiency %, GHI W/m², temp derating
`get_battery_state()`	Community BMS (sim)	State of charge, capacity, charging status
`get_grid_status()`	EIA Open Data (`EIA_API_KEY`)	Pricing period, rate/kWh, renewable %, CO2 intensity
`get_nrel_pvwatts_baseline()`	NREL PVWatts v8 (`NREL_API_KEY`)	Annual + current-month typical kWh + avg kW for the 72 kW array

Tool results feed back as a 2-message sequence matching the training distribution: {"role": "assistant", "tool_calls": [...]} then {"role": "tool", "name": "<fn>", "content": json.dumps(result)}. This format is shared across solarhive_datagen.py, solarhive_finetune.py, solarhive_inference.py Cell 4, and test_ollama_tools.py Solution B.
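
The two-message sequence can be built with a small helper like the one below. This is a sketch: `append_tool_exchange` is a hypothetical name, the layout of each `tool_calls` entry follows the common chat-template convention and is an assumption here, and the result payload is made-up illustrative data rather than a real API response.

```python
import json
from typing import Any, Dict, List


def append_tool_exchange(
    messages: List[Dict[str, Any]],
    name: str,
    arguments: Dict[str, Any],
    result: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Return a new message list with the assistant tool call and its result appended."""
    assistant_turn = {
        "role": "assistant",
        # Assumed layout of one tool_calls entry (common chat-template convention).
        "tool_calls": [
            {"type": "function", "function": {"name": name, "arguments": arguments}}
        ],
    }
    # The tool result is serialized to a JSON string, as described above.
    tool_turn = {"role": "tool", "name": name, "content": json.dumps(result)}
    return messages + [assistant_turn, tool_turn]


history = [{"role": "user", "content": "What's the current grid rate?"}]
# Illustrative payload only; real field names come from the tool implementation.
fake_result = {"pricing_period": "off-peak", "rate_per_kwh": 0.12}
history = append_tool_exchange(history, "get_grid_status", {}, fake_result)
print(json.dumps(history[-2:], indent=2))
```

Note that the assistant turn carries no `content` key, which is the message shape the Technical Notes say breaks single-step tokenization.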

3. Selective Tool Reasoning

The model intelligently decides when to call tools:

"What time does peak pricing start?" → Calls: get_grid_status() only
"Is today's production above typical for January?" → Calls: get_solar_production() + get_nrel_pvwatts_baseline()
"Should I run my pool heater now?" → Calls: all 5 tools
"What are general maintenance tips?" → Calls: none

4. Inference-time When2Call Validation

Three held-out probes validate 3 of the 4 failure-mode categories from Ross, H., Mahabaleshwarkar, A. S., & Suhara, Y. (2025). When2Call: When (not) to Call Tools. arXiv:2504.18851 — the paper documents 9–67% tool-hallucination rates on (c)+(d) in untrained community models:

(b) "What's the current grid rate?" → expect get_grid_status call (well-specified, in-scope)
(c) "How much will a 10 kW array produce today?" → expect follow-up question (does NOT auto-fill location default)
(d) "What's the current air quality index in Ann Arbor?" → expect refusal + redirect (does NOT hallucinate a tool)

Models trained without explicit unable-to-answer and follow-up clarification examples typically fail (c) + (d). The SolarHive training corpus includes 16 such examples (10 unable-to-answer + 6 follow-up clarification) following the When2Call taxonomy; the A4B family achieves 3/3 on these probes (directly measured on A4B LoRA, inferred-lossless on this merged variant + on A4B NF4).
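
A minimal scorer for the three probes might look like the sketch below. It only checks the tool-call decision (right tool, or no tool), not whether the text is a well-formed follow-up or refusal, and the function names and `observed` format are hypothetical rather than taken from the SolarHive evaluation code.

```python
from typing import Dict, List, Optional

# The three probes listed above: (category, prompt, expected tool or None).
PROBES = [
    ("b", "What's the current grid rate?", "get_grid_status"),
    ("c", "How much will a 10 kW array produce today?", None),          # expect a follow-up question
    ("d", "What's the current air quality index in Ann Arbor?", None),  # expect refusal + redirect
]


def probe_passes(expected_tool: Optional[str], called_tools: List[str]) -> bool:
    """Pass if exactly the expected tool was called, or no tool when none is expected."""
    if expected_tool is None:
        return len(called_tools) == 0
    return called_tools == [expected_tool]


def when2call_score(observed: Dict[str, List[str]]) -> int:
    """`observed` maps a category letter to the tool names the model called."""
    return sum(probe_passes(tool, observed.get(cat, [])) for cat, _, tool in PROBES)


# Behaviour this card reports for the E4B family: get_weather is called on probe (d).
e4b_like = {"b": ["get_grid_status"], "c": [], "d": ["get_weather"]}
a4b_like = {"b": ["get_grid_status"], "c": [], "d": []}
print(when2call_score(e4b_like), when2call_score(a4b_like))
```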

Community Model Specifications

Parameter	Value
Location	Ann Arbor, Michigan (42.2808°N, 83.7430°W)
Community size	12 homes
Total panel capacity	72 kW
Shared battery storage	100 kWh
Grid region	MISO (Midcontinent Independent System Operator)

Technical Notes

Merged model: LoRA adapters pre-merged into base weights via Unsloth save_pretrained_merged("merged_16bit") — no PEFT/Unsloth needed at inference
Processor from base model: Use AutoProcessor.from_pretrained("google/gemma-4-26b-a4b-it") — the base model's processor has the correct chat template with native tool-call support
Two-step tokenization: Single-step tokenize=True crashes in transformers 5.5.x on messages without a content key — always use the two-step approach
System prompt repetition: Repeated system prompt improves instruction following (Leviathan et al., 2024)
VRAM requirements: ~48 GB in BF16 — fits on A100-80GB, H100, ZeroGPU H200, or RTX PRO 6000
Sampling: temperature=1.0, top_p=0.95, top_k=64 (Kaggle-recommended defaults)
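
The VRAM figure can be sanity-checked with a weights-only estimate. This is a rough sketch: it uses the 25.2B total parameter count from the specifications table, ignores activations, KV cache, and framework overhead, and the function name is illustrative.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB (2**30 bytes)."""
    return n_params * bytes_per_param / 2**30


total_params = 25.2e9  # total parameter count from the specifications table

bf16_gib = weight_memory_gib(total_params, 2.0)   # BF16 stores 2 bytes per parameter
four_bit_gib = weight_memory_gib(total_params, 0.5)  # 4-bit packing, ignoring quantization metadata

print(f"BF16 weights:  ~{bf16_gib:.1f} GiB")
print(f"4-bit weights: ~{four_bit_gib:.1f} GiB")
```

Runtime usage is higher than the weights-only figure, which is in line with the 52.4 GB utilization reported for the validation run.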

Limitations

Prototype tested on single community (12 homes, Ann Arbor) — validation needed across geographies
The model occasionally reports "60 kW" instead of the correct 72 kW capacity in direct VQA responses
Tool responses depend on external API availability with rate limits
Battery state simulator is deterministic for demonstrations
Requires ~48 GB VRAM (BF16) — does not fit on consumer GPUs; use the E4B model for edge deployment

Future Iteration — Multi-Token Prediction (MTP) Drafters

Not in the measured numbers above. Google announced Gemma 4 MTP drafters on May 5, 2026 (blog, overview, HF collection, Kaggle, @GoogleGemma) — after this artifact's final benchmark was captured. The benchmarks above reflect standard autoregressive decoding only. MTP integration is documented here as future iteration; no measured speedup is claimed in this release.

Theoretical foundation. Speculative decoding (Leviathan, Kalman & Matias, ICML 2023, arXiv:2211.17192) accelerates generation without changing the output distribution under argmax decoding: a smaller drafter proposes γ candidate tokens, the target verifies all γ in a single parallel forward pass, accepted tokens are kept, and any rejection is resampled from a corrected distribution. The output distribution is preserved exactly regardless of drafter quality; only acceptance rate α, and therefore walltime speedup, varies.
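
The relationship between acceptance rate and speedup can be sketched with the closed-form expressions from the speculative decoding paper, which assume each drafted token is accepted independently with probability α. The cost coefficient `c` (drafter step time divided by target step time) and the example values below are illustrative assumptions, not measurements on this model.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens produced per target forward pass: (1 - alpha^(gamma+1)) / (1 - alpha)."""
    if alpha >= 1.0:
        return float(gamma + 1)  # every draft accepted, plus one token from the target
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def expected_speedup(alpha: float, gamma: int, c: float) -> float:
    """Expected walltime improvement factor, where c = drafter cost / target cost per step."""
    return expected_tokens_per_step(alpha, gamma) / (gamma * c + 1.0)


# Illustrative values only: alpha and c have not been measured for this target.
for alpha in (0.5, 0.7, 0.9):
    speedup = expected_speedup(alpha, gamma=4, c=0.05)
    print(f"alpha={alpha:.1f}, gamma=4, c=0.05: ~{speedup:.2f}x")
```

The formula shows why the drafter mismatch matters for this target: a lower α reduces the speedup, while the accepted output stays the same.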

What Google released on May 5, 2026. Paired drafter checkpoints for all four IT-tuned Gemma 4 variants (gemma-4-E2B-it-assistant, gemma-4-E4B-it-assistant, gemma-4-26B-A4B-it-assistant, gemma-4-31B-it-assistant), discoverable via the google/gemma-4 Hugging Face collection and on Kaggle Models. The drafters share the input embedding table with their paired target and consume the target's last-layer activations (architecture per the MTP overview). For this target the paired drafter is google/gemma-4-26B-A4B-it-assistant (0.4B params). Google reports up to 3× decode speedup with no quality degradation on the 26B-A4B configuration, and 2.2× on Apple Silicon at batch sizes 4–8. Tested runtimes named in the blog: LiteRT-LM, MLX, Hugging Face Transformers, vLLM, SGLang, Ollama.

Integration cost is one kwarg in Hugging Face Transformers — the future-iteration cell in solarhive_inference.py loads this merged target paired with the base-paired drafter directly:

target    = AutoModelForCausalLM.from_pretrained("Truthseeker87/solarhive-26b-a4b-merged", dtype=torch.bfloat16, ...)
assistant = AutoModelForCausalLM.from_pretrained("google/gemma-4-26B-A4B-it-assistant",   dtype=torch.bfloat16, ...)
target.generate(**inputs, assistant_model=assistant)  # MTP enabled

The integration ships as a gated future-iteration cell (§14, _RUN_MTP_DEMO = False); reviewers can flip the flag to reproduce a baseline-vs-MTP comparison under argmax decoding.

Open question specific to this LoRA-merged BF16 target. Per the 2023 speculative-sampling guarantee, correctness is invariant to drafter quality — the target's verification step preserves the exact output distribution regardless of what the drafter proposes. What varies is acceptance rate α, since Google's released drafter was trained against the base gemma-4-26B-A4B-it, not against this LoRA-merged target. Measured α and the resulting walltime speedup on this target are the planned post-hackathon contribution.

Companion Repositories

Model	Repository	Purpose
SolarHive 26B A4B Merged	This repo	Production inference — no Unsloth needed
SolarHive 26B A4B LoRA	solarhive-26b-a4b-lora	LoRA adapters for further fine-tuning
SolarHive 26B A4B NF4	solarhive-26b-a4b-nf4	Pre-quantized 4-bit cloud model for HF Spaces / 24 GB+ GPUs
SolarHive E4B LoRA	solarhive-e4b-lora	E4B adapter weights (~200 MB) — apply over base via Unsloth
SolarHive E4B safetensors	solarhive-e4b-ollama	Edge model — merged safetensors source for transformers research and GGUF conversion via llama.cpp
SolarHive E4B GGUF	solarhive-e4b-gguf	Edge deployment — Q4_K_M GGUF + mmproj for Ollama / llama.cpp on 16 GB CPU laptop. 10/10 benchmark.
SolarHive Dataset	solarhive-community-solar-multimodal	1,727 training examples (1,713 text + 14 image-grounded)
Live Demo	HF Space	Interactive Gradio demo
LiteRT-LM Python edge runtime	`solarhive_e4b_litert_v3.1.ipynb`	LiteRT Special Tech Track entry — runs upstream base `litert-community/gemma-4-E4B-it-litert-lm` `.litertlm` (3.66 GB) + SolarHive UX layer + on-device agentic loop. Q&A 8/8 on Colab Pro CPU + High-RAM. Fine-tuned LiteRT-LM bundle is a planned next iteration once upstream `gemma4` example module lands in `ai_edge_torch.generative.examples/`.
GitHub	the-gemma4-good-hackathon-solarhive	Full source code and notebooks

Data Pipeline Diagnostics

Training data quality validated with 14 diagnostic charts generated from live API data:

Solar Irradiance and Production



GHI distribution: Ann Arbor median 265 W/m² vs San Mateo 364 W/m²; Michigan receives ~27% less solar irradiance
Hourly production curve: Peak at 1-2pm. Ann Arbor peaks higher but with wider variance
Month x hour heatmaps: Ann Arbor peaks June-July at 45+ kW midday. San Mateo has a broader, flatter production season
Temperature derating: Flat at 1.0 below 77°F, linear decline at 0.4%/°F above. Validates the derating formula
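
The temperature derating curve described in the captions above can be written as a small function. This is a sketch based only on the caption (flat at 1.0 up to 77°F, then a linear 0.4% loss per °F); the function name, parameter names, and the clamp at zero are assumptions, and the SolarHive implementation itself is not shown in this card.

```python
def temperature_derating(
    temp_f: float,
    threshold_f: float = 77.0,
    loss_per_deg_f: float = 0.004,
) -> float:
    """Multiplicative derating factor for panel output at a given temperature."""
    if temp_f <= threshold_f:
        return 1.0  # no derating at or below the threshold
    # Linear decline above the threshold; clamped at 0 as a safety assumption.
    return max(0.0, 1.0 - loss_per_deg_f * (temp_f - threshold_f))


for temp in (60, 77, 87, 97):
    print(f"{temp}°F -> factor {temperature_derating(temp):.3f}")
```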

Environmental Correlations



Feature correlations: GHI to production r=0.97 (near-perfect). Humidity to GHI r=-0.57
Cloud cover by season: Ann Arbor consistently cloudier than San Mateo across all seasons
Seasonal production: Summer median ~33 kW (Ann Arbor) vs ~26 kW (San Mateo). Winter drops to ~12 kW
GHI vs production scatter: Clear-sky (tight linear) vs cloudy (scattered); demonstrates direct vs diffuse radiation physics

Cross-Validation and Grid Analysis



Open-Meteo vs PVWatts: Strong seasonal agreement validates GHI formula against NREL industry standard
OWM snapshot: Temperature, clouds, wind, humidity at data generation time
Fuel mix: MISO (33.5% gas, 23.4% wind, 18.8% coal) vs CAISO (35.8% solar, 20.6% wind)
Renewable % and CO2: CAISO (EIA code CISO) hits 100% renewable at midday solar peaks; MISO ranges 20-50%

Atmospheric Decomposition



Irradiance decomposition: Total GHI split into direct-beam (DNI) and diffuse (DHI) on a clear summer day. Confirms training on physically-decomposed solar radiation, important for cloudy-day production estimates where diffuse dominates
Vertical cloud-cover stack: Composition by month (low <3 km / mid 3–8 km / high >8 km). Low stratus attenuates GHI more aggressively than high cirrus, which exposes the model to seasonal shifts in cloud-layer composition

Citation

@misc{solarhive2026,
  title={SolarHive: AI-Powered Community Solar Energy Intelligence},
  author={Youshen Lim},
  year={2026},
  url={https://github.com/youshen-lim/the-gemma4-good-hackathon-solarhive},
  note={Gemma 4 Good Hackathon submission — Google DeepMind x Kaggle}
}

Gemma is a trademark of Google LLC.

Downloads last month: 41

Safetensors: model size 27B params, tensor type BF16
