Updating the markdown to read from the dataset directly
prompt_order_exeriment/pages/index.py
@@ -1,4 +1,9 @@
 import reflex as rx
+from datasets import load_dataset
+
+dataset = load_dataset("derek-thomas/labeled-multiple-choice-explained-falcon-tokenized", split='train')
+df = dataset.to_pandas()
+
 
 p1 = '''
 # Prompt Order Experiment
@@ -19,21 +24,21 @@ This is our user message, we can see the question and answer choices
 
 <details>
 <summary>Click to show prompt!</summary>
+'''
+p2 = f'''
 
-```
-
-Question: What is satellite technology used for predicting?
-Answer Choices: (a) Seconds and minutes (b) The strength and magnitude of an earthquake (c) What it's like outside each day (d) 70-75 degrees fahrenheit (e) Rapid changes occur (f) Dead-ends and false starts. (g) Snow, ice, and rock (h) Around 5 to 27 degrees celsius[/INST]
+```json
+{df['conversation_RFA_gpt3_5'].iloc[0][0]}
 ```
 
 This is our assistant message, you can see that we are forcing a JSON (note I added spacing for visual purposes), and we are putting the reasoning first. Using a JSON in fine-tuning will improve our structured generation results as the model will get used to responding in that "space".
-```
-{
-"Reasoning": "a) Seconds and minutes: This option is incorrect because satellite technology is not used for predicting time intervals. Satellite technology is used for various purposes such as communication, navigation, and weather forecasting, but it is not used for predicting time intervals.\n\nb) The strength and magnitude of an earthquake: This option is incorrect because satellite technology is not used for predicting earthquakes. Earthquake prediction is a complex process that involves seismology and other scientific methods, but satellite technology is not one of them.\n\nd) 70-75 degrees Fahrenheit: This option is incorrect because satellite technology is not used for predicting specific temperature ranges. While satellite technology can provide temperature data, it is not used for predicting specific temperature ranges.\n\ne) Rapid changes occur: This option is too vague and does not provide enough information to determine whether it is correct or not. Satellite technology can be used to monitor changes in various environmental factors, but it is not used specifically for predicting rapid changes.\n\nf) Dead-ends and false starts: This option is incorrect because it is not related to satellite technology or any type of prediction.\n\ng) Snow, ice, and rock: This option is incorrect because it is too specific and does not cover the broad range of predictions that satellite technology can be used for. While satellite technology can be used to monitor snow, ice, and rock formations, it is not used exclusively for this purpose.\n\nh) Around 5 to 27 degrees Celsius: This option is incorrect because it is too specific and does not cover the broad range of temperature predictions that satellite technology can be used for. While satellite technology can provide temperature data, it is not used exclusively for predicting temperatures within a specific range.\n\nTherefore, the correct answer is c) what it's like outside each day, as satellite technology is commonly used for weather forecasting and predicting daily weather conditions.",
-"Final Answer": "c"
-}</seq>
+```json
+{df['conversation_RFA_gpt3_5'].iloc[0][1]}
 ```
 </details>
+'''
+
+p3 = f'''
 
 ### **Scenario 2: Q - AC - FA - R** (Falcon and GPT3.5)
 
@@ -41,19 +46,16 @@ An awkward order, placing reasoning after the final answer. While it is faster,
 
 <details>
 <summary>Click to show prompt!</summary>
-```
-
-Question: What is satellite technology used for predicting?
-Answer Choices: (a) Seconds and minutes (b) The strength and magnitude of an earthquake (c) What it's like outside each day (d) 70-75 degrees fahrenheit (e) Rapid changes occur (f) Dead-ends and false starts. (g) Snow, ice, and rock (h) Around 5 to 27 degrees celsius[/INST]
+```json
+{df['conversation_FAR_gpt3_5'].iloc[0][0]}
 ```
 
-```
-{
-'Final Answer': 'c',
-'Reasoning': "a) Seconds and minutes: This option is incorrect because satellite technology is not used for predicting time intervals. Satellite technology is used for various purposes such as communication, navigation, and weather forecasting, but it is not used for predicting time intervals.\n\nb) The strength and magnitude of an earthquake: This option is incorrect because satellite technology is not used for predicting earthquakes. Earthquake prediction is a complex process that involves seismology and other scientific methods, but satellite technology is not one of them.\n\nd) 70-75 degrees Fahrenheit: This option is incorrect because satellite technology is not used for predicting specific temperature ranges. While satellite technology can provide temperature data, it is not used for predicting specific temperature ranges.\n\ne) Rapid changes occur: This option is too vague and does not provide enough information to determine whether it is correct or not. Satellite technology can be used to monitor changes in various environmental factors, but it is not used specifically for predicting rapid changes.\n\nf) Dead-ends and false starts: This option is incorrect because it is not related to satellite technology or any type of prediction.\n\ng) Snow, ice, and rock: This option is incorrect because it is too specific and does not cover the broad range of predictions that satellite technology can be used for. While satellite technology can be used to monitor snow, ice, and rock formations, it is not used exclusively for this purpose.\n\nh) Around 5 to 27 degrees Celsius: This option is incorrect because it is too specific and does not cover the broad range of temperature predictions that satellite technology can be used for. While satellite technology can provide temperature data, it is not used exclusively for predicting temperatures within a specific range.\n\nTherefore, the correct answer is c) what it's like outside each day, as satellite technology is commonly used for weather forecasting and predicting daily weather conditions."
-}</seq>
+```json
+{df['conversation_FAR_gpt3_5'].iloc[0][1]}
 ```
 </details>
+'''
+p4 = '''
 
 ### **Scenario 3: Q - AC - FA**
 
@@ -70,5 +72,5 @@ Structured generation ensures consistent response formats, which is crucial for
 
 def page():
     return rx.vstack(
-        rx.markdown(p1),
+        rx.markdown(p1+p2+p3+p4),
     )
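A possible follow-up to this change, sketched under assumptions rather than taken from the commit: interpolating a dataset cell straight into an f-string renders whatever `str()` returns for that cell, which may not be valid JSON even inside a `json` fence. The helper below pretty-prints a message when its content parses as JSON and falls back to plain text otherwise. The names `render_message`, `render_conversation`, and `FENCE` are hypothetical, and the `role`/`content` message layout is an assumption about the dataset, so a stand-in row is used instead of the real `df`.

```python
import json

# Built at runtime so this example never contains a literal markdown fence.
FENCE = "`" * 3


def render_message(message: dict) -> str:
    """Return one chat message as a fenced markdown block."""
    # Assumption: each message is a dict with a "content" entry.
    content = message["content"]
    try:
        # Pretty-print when the content is itself a JSON document.
        body = json.dumps(json.loads(content), indent=2)
        lang = "json"
    except (TypeError, ValueError):
        # Not JSON (for example the user prompt): show it unchanged.
        body = str(content)
        lang = "text"
    return f"{FENCE}{lang}\n{body}\n{FENCE}"


def render_conversation(conversation) -> str:
    """Join the rendered messages with blank lines between the blocks."""
    return "\n\n".join(render_message(m) for m in conversation)


# Stand-in for one dataset row; the real column layout may differ.
sample = [
    {"role": "user",
     "content": "Question: What is satellite technology used for predicting?"},
    {"role": "assistant",
     "content": '{"Reasoning": "Weather forecasting.", "Final Answer": "c"}'},
]
markdown = render_conversation(sample)
```

If the layout assumption holds, `render_conversation(df['conversation_RFA_gpt3_5'].iloc[0])` could replace the two interpolated cells in `p2`, with the explanatory paragraph placed between the two rendered messages.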