axm92 t1_j6tw995 wrote
Reply to comment by LetterRip in [R] Faithful Chain-of-Thought Reasoning by starstruckmon
Thanks! Can you please clarify what you mean by the prompts being specific to the datasets for PaL?
As an example, for the ~10 math reasoning datasets used in PaL, identical prompts were used (the same prompt for all datasets, without changing anything). The prompts/code are also open-sourced at https://reasonwithpal.com/ if you want to check it out!
Incidentally, the idea that Python programs lead to faithful reasoning chains was used in PaL to create a new split of GSM, called GSM-hard. GSM-hard is available on Hugging Face.
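To make the idea concrete, here's a hand-wavy sketch of the PaL-style setup (not our exact prompt; `generate` is a placeholder for whatever LLM completion call you use):

```python
# Illustrative sketch only: the model writes a Python solution() function,
# and the interpreter (not the LLM) produces the final answer.
FEW_SHOT = """Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?

# solution in Python:
def solution():
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    return money_initial - money_spent

"""

def pal_answer(question, generate):
    prompt = FEW_SHOT + f"Q: {question}\n\n# solution in Python:\n"
    code = generate(prompt)      # LLM emits Python instead of free-form reasoning
    scope = {}
    exec(code, scope)            # running the program yields the answer
    return scope["solution"]()
```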
(I'm a co-author of the PaL paper. )
axm92 t1_j5b2ug8 wrote
Reply to comment by dancingnightly in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
I’m not sure if I understand you, but you can generate these graphs over long documents, and then run a GNN.
For creating graphs over long documents, one trick I’ve used in my past papers is to create a graph per 3 paragraphs, and then merge these graphs (by fusing similar nodes).
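A rough sketch of what I mean, assuming networkx graphs, some `build_graph` extractor of your own, and exact node-name matching as the notion of "similar" nodes:

```python
# Chunk-and-merge sketch: build a small graph per ~3 paragraphs, then fuse them.
import networkx as nx

def graph_over_document(paragraphs, build_graph, chunk_size=3):
    merged = nx.Graph()
    for i in range(0, len(paragraphs), chunk_size):
        chunk = " ".join(paragraphs[i:i + chunk_size])
        g = build_graph(chunk)            # entity/relation graph for this chunk
        # nx.compose fuses nodes that share the same identifier, so normalizing
        # node names (lowercasing, lemmatizing, etc.) controls what counts as
        # "similar" nodes when the chunk graphs are merged.
        merged = nx.compose(merged, g)
    return merged
```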
axm92 t1_j58761h wrote
Reply to comment by dancingnightly in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
>Could it be extended from sentences to whole context paragraphs at some stage, with the entities dynamically being different graph items?
Absolutely. Highly recommend that you try playing around with some examples here: https://beta.openai.com/playground.
axm92 t1_j57jfrk wrote
Reply to comment by Low-Mood3229 in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
>My use case is more classification of datapoints(containing many seemingly unimportant features that may or may not have some relationship to each other. Relationships that are captured in the knowledge graph
Sounds eerily close to one of our papers: https://aclanthology.org/2021.emnlp-main.508.pdf
To solve commonsense reasoning questions, we first generate a graph that captures relationships between entities in the question (if you're thinking "chain-of-thought" prompting--yes, the idea is similar). Then, we jointly train a mixture-of-experts model with a classifier (RoBERTa) to do three things: i) learn to discard useless nodes, ii) pool node representations from useful nodes into a single graph embedding, and iii) classify using question + graph embeddings.
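For a rough feel of the setup, here's a loose PyTorch sketch (not our actual code; the dimensions, names, and the soft gate standing in for hard node discarding are all assumptions, and the question encoder is assumed to share the node embedding size):

```python
import torch
import torch.nn as nn

class GraphAugmentedClassifier(nn.Module):
    def __init__(self, question_encoder, node_dim=768, n_classes=2):
        super().__init__()
        self.question_encoder = question_encoder        # e.g., a RoBERTa encoder
        self.node_scorer = nn.Linear(node_dim, 1)       # learns to down-weight useless nodes
        self.classifier = nn.Linear(node_dim * 2, n_classes)

    def forward(self, question_inputs, node_embeddings):
        # (i) score nodes; a soft gate approximates discarding useless ones
        gates = torch.sigmoid(self.node_scorer(node_embeddings))          # [n_nodes, 1]
        # (ii) pool gated node representations into a single graph embedding
        graph_emb = (gates * node_embeddings).sum(0) / gates.sum().clamp(min=1e-6)
        # (iii) classify from question + graph embeddings
        q_emb = self.question_encoder(**question_inputs).pooler_output.squeeze(0)
        return self.classifier(torch.cat([q_emb, graph_emb], dim=-1))
```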
This video may give a good TLDR too.
axm92 t1_izwes71 wrote
Reply to [D] - Has Open AI said what ChatGPT's architecture is? What technique is it using to "remember" previous prompts? by 029187
I have a theory and a Colab notebook: https://twitter.com/aman_madaan/status/1599549721030246401?s=46&t=44Qgnk8MlscEL9q91BWdDA
Similar findings: https://twitter.com/sjwhitmore/status/1601254826947784705?s=46&t=44Qgnk8MlscEL9q91BWdDA
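If the short version of that theory is that the model itself is stateless and the interface simply re-sends the conversation so far with every turn, a toy version looks like this (purely illustrative, not OpenAI's actual implementation; `complete` is a stand-in for any LLM completion call):

```python
# "Memory" as prompt concatenation: each turn re-sends a (possibly truncated) transcript.
def chat_turn(history, user_message, complete, max_chars=8000):
    history = history + [("User", user_message)]
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    prompt = transcript[-max_chars:] + "\nAssistant:"   # naive truncation to fit the context window
    reply = complete(prompt)
    return history + [("Assistant", reply)], reply
```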
axm92 t1_ito7hmf wrote
In case you’re interested, I have a minimal implementation here: https://github.com/madaan/minimal-text-diffusion
axm92 t1_j6uf2a7 wrote
Reply to comment by LetterRip in [R] Faithful Chain-of-Thought Reasoning by starstruckmon
Ah, I see; thanks for clarifying. I take your point, but I wouldn't say that the prompts require extensive knowledge of the test set. After all:
> As an example, for the ~10 math reasoning datasets used in PaL, identical prompts were used (same prompt for all datasets, without changing anything).
In particular, take a look at the section on GSM-hard (Section 4.1). You may also enjoy the analysis in the new version of the paper (Section 6: https://arxiv.org/pdf/2211.10435.pdf).
Further, "Let's think step by step" is outperformed by "Write Python code to solve this." We'll add the numbers in the next version, but if you are interested please lmk and I can share the results earlier.
Thanks again for reading our work and sharing your feedback, I really appreciate it.