## How to Fine-tune DeepSeek-Coder

We provide the script `finetune_deepseekcoder.py` for users to fine-tune our models on downstream tasks.

The script supports training with [DeepSpeed](https://github.com/microsoft/DeepSpeed). First, install the required packages:

```bash
pip install -r requirements.txt
```

Please follow the [Sample Dataset Format](https://huggingface.co/datasets/nickrosh/Evol-Instruct-Code-80k-v1) to prepare your training data.
Each line is a JSON-serialized object with two required fields: `instruction` and `output`.
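
For illustration, the snippet below writes two hypothetical records in this format; the file name `train.jsonl` and the example contents are placeholders, not part of the official dataset. Only the two field names are prescribed.

```python
import json

# Hypothetical training records; only the field names are prescribed.
samples = [
    {
        "instruction": "Write a Python function that checks whether a number is prime.",
        "output": "def is_prime(n):\n    if n < 2:\n        return False\n    return all(n % i for i in range(2, int(n ** 0.5) + 1))",
    },
    {
        "instruction": "Reverse a string in Python.",
        "output": "def reverse(s):\n    return s[::-1]",
    },
]

# One JSON object per line (JSON Lines format).
with open("train.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```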

After preparing your data, you can use the sample shell script below to fine-tune `deepseek-ai/deepseek-coder-6.7b-instruct`.
Remember to specify `DATA_PATH` and `OUTPUT_PATH`,
and choose hyper-parameters (e.g., `learning_rate`, `per_device_train_batch_size`) appropriate for your scenario. Note that the effective global batch size is `per_device_train_batch_size × gradient_accumulation_steps × number of GPUs`.

```bash
DATA_PATH="<your_data_path>"
OUTPUT_PATH="<your_output_path>"
MODEL_PATH="deepseek-ai/deepseek-coder-6.7b-instruct"

deepspeed finetune_deepseekcoder.py \
    --model_name_or_path $MODEL_PATH \
    --data_path $DATA_PATH \
    --output_dir $OUTPUT_PATH \
    --num_train_epochs 3 \
    --model_max_length 1024 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 100 \
    --save_total_limit 100 \
    --learning_rate 2e-5 \
    --warmup_steps 10 \
    --logging_steps 1 \
    --lr_scheduler_type "cosine" \
    --gradient_checkpointing True \
    --report_to "tensorboard" \
    --deepspeed configs/ds_config_zero3.json \
    --bf16 True
```
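
Once training finishes, the checkpoint in `OUTPUT_PATH` can be loaded like any Hugging Face model. The sketch below is a minimal example, assuming the final weights were consolidated into standard `transformers` format (with ZeRO-3 you may first need to gather sharded weights, e.g. via the `zero_to_fp32.py` script DeepSpeed places in the checkpoint directory); the prompt is just an illustration.

```python
# Minimal inference sketch; OUTPUT_PATH and the prompt are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

OUTPUT_PATH = "<your_output_path>"  # the --output_dir used above

tokenizer = AutoTokenizer.from_pretrained(OUTPUT_PATH, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    OUTPUT_PATH,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Write a quicksort function in Python."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```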