axm92 t1_j6tw995 wrote
Reply to comment by LetterRip in [R] Faithful Chain-of-Thought Reasoning by starstruckmon
Thanks! Can you please clarify what you mean by the prompts being specific to the datasets for PaL?
As an example, for the ~10 math reasoning datasets used in PaL, identical prompts were used (the same prompt for all datasets, without changing anything). The prompts/code are also open-sourced at https://reasonwithpal.com/ if you want to check it out!
Incidentally, the idea that Python programs lead to faithful reasoning chains was used in PaL to create a new split of GSM, called GSM-hard. GSM-hard is available on Hugging Face.
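To make the idea concrete, here's a hand-wavy sketch of the PaL-style setup (not our exact prompt; `generate` is a placeholder for whatever LLM completion call you use):

```python
# Illustrative sketch only: the model writes a Python solution() function,
# and the interpreter (not the LLM) produces the final answer.
FEW_SHOT = """Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?

# solution in Python:
def solution():
    money_initial = 23
    bagels = 5
    bagel_cost = 3
    money_spent = bagels * bagel_cost
    return money_initial - money_spent

"""

def pal_answer(question, generate):
    prompt = FEW_SHOT + f"Q: {question}\n\n# solution in Python:\n"
    code = generate(prompt)      # LLM emits Python instead of free-form reasoning
    scope = {}
    exec(code, scope)            # running the program yields the answer
    return scope["solution"]()
```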
(I'm a co-author of the PaL paper. )
axm92 t1_j5b2ug8 wrote
Reply to comment by dancingnightly in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
I’m not sure if I understand you, but you can generate these graphs over long documents, and then run a GNN.
For creating graphs over long documents, one trick I’ve used in my past papers is to create a graph per 3 paragraphs, and then merge these graphs (by fusing similar nodes).
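A rough sketch of what I mean, assuming networkx graphs, some `build_graph` extractor of your own, and exact node-name matching as the notion of "similar" nodes:

```python
# Chunk-and-merge sketch: build a small graph per ~3 paragraphs, then fuse them.
import networkx as nx

def graph_over_document(paragraphs, build_graph, chunk_size=3):
    merged = nx.Graph()
    for i in range(0, len(paragraphs), chunk_size):
        chunk = " ".join(paragraphs[i:i + chunk_size])
        g = build_graph(chunk)            # entity/relation graph for this chunk
        # nx.compose fuses nodes that share the same identifier, so normalizing
        # node names (lowercasing, lemmatizing, etc.) controls what counts as
        # "similar" nodes when the chunk graphs are merged.
        merged = nx.compose(merged, g)
    return merged
```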
axm92 t1_j58761h wrote
Reply to comment by dancingnightly in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
>Could it be extended from sentences to whole context paragraphs at some stage, with the entities dynamically being different graph items?
Absolutely. Highly recommend that you try playing around with some examples here: https://beta.openai.com/playground.
axm92 t1_j57jfrk wrote
Reply to comment by Low-Mood3229 in [R] Is there a way to combine a knowledge graph and other types of data for ML purposes? by Low-Mood3229
>My use case is more classification of datapoints(containing many seemingly unimportant features that may or may not have some relationship to each other. Relationships that are captured in the knowledge graph
Sounds eerily close to one of our papers: https://aclanthology.org/2021.emnlp-main.508.pdf
To solve commonsense reasoning questions, we first generate a graph that captures relationships between entities in the question (if you're thinking "chain-of-thought" prompting--yes, the idea is similar). Then, we jointly train a mixture-of-experts model with a classifier (RoBERTa) to do three things: i) learn to discard useless nodes, ii) pool node representations from useful nodes into a single graph embedding, and iii) classify using question + graph embeddings.
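For a rough feel of the setup, here's a loose PyTorch sketch (not our actual code; the dimensions, names, and the soft gate standing in for hard node discarding are all assumptions, and the question encoder is assumed to share the node embedding size):

```python
import torch
import torch.nn as nn

class GraphAugmentedClassifier(nn.Module):
    def __init__(self, question_encoder, node_dim=768, n_classes=2):
        super().__init__()
        self.question_encoder = question_encoder        # e.g., a RoBERTa encoder
        self.node_scorer = nn.Linear(node_dim, 1)       # learns to down-weight useless nodes
        self.classifier = nn.Linear(node_dim * 2, n_classes)

    def forward(self, question_inputs, node_embeddings):
        # (i) score nodes; a soft gate approximates discarding useless ones
        gates = torch.sigmoid(self.node_scorer(node_embeddings))          # [n_nodes, 1]
        # (ii) pool gated node representations into a single graph embedding
        graph_emb = (gates * node_embeddings).sum(0) / gates.sum().clamp(min=1e-6)
        # (iii) classify from question + graph embeddings
        q_emb = self.question_encoder(**question_inputs).pooler_output.squeeze(0)
        return self.classifier(torch.cat([q_emb, graph_emb], dim=-1))
```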
This video may give a good TLDR too.
axm92 t1_izwes71 wrote
Reply to [D] - Has Open AI said what ChatGPT's architecture is? What technique is it using to "remember" previous prompts? by 029187
I have a theory and a Colab notebook: https://twitter.com/aman_madaan/status/1599549721030246401?s=46&t=44Qgnk8MlscEL9q91BWdDA
Similar findings: https://twitter.com/sjwhitmore/status/1601254826947784705?s=46&t=44Qgnk8MlscEL9q91BWdDA
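If the short version of that theory is that the model itself is stateless and the interface simply re-sends the conversation so far with every turn, a toy version looks like this (purely illustrative, not OpenAI's actual implementation; `complete` is a stand-in for any LLM completion call):

```python
# "Memory" as prompt concatenation: each turn re-sends a (possibly truncated) transcript.
def chat_turn(history, user_message, complete, max_chars=8000):
    history = history + [("User", user_message)]
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    prompt = transcript[-max_chars:] + "\nAssistant:"   # naive truncation to fit the context window
    reply = complete(prompt)
    return history + [("Assistant", reply)], reply
```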
axm92 t1_ito7hmf wrote
In case you’re interested, I have a minimal implementation here: https://github.com/madaan/minimal-text-diffusion
axm92 t1_j6uf2a7 wrote
Reply to comment by LetterRip in [R] Faithful Chain-of-Thought Reasoning by starstruckmon
Ah, I see; thanks for clarifying. I take your point, but I wouldn't say that the prompts require extensive knowledge of the test set. After all:
> As an example, for the ~10 math reasoning datasets used in PaL, identical prompts were used (same prompt for all datasets, without changing anything).
In particular, take a look at the section on GSM-hard (Section 4.1). You may also enjoy the analysis in the new version of the paper (Section 6: https://arxiv.org/pdf/2211.10435.pdf).
Further, "Let's think step by step" is outperformed by "Write Python code to solve this." We'll add the numbers in the next version, but if you are interested please lmk and I can share the results earlier.
Thanks again for reading our work and sharing your feedback, I really appreciate it.