Lajamerr_Mittesdine t1_j1fuu1j wrote
Reply to comment by A1-Delta in [P] App that Determines Whether You've Been Naughty or Nice Based on Your Reddit Comments by Steven_Johnson34
It's 2022. Everyone should have their own set of email addresses on their own domain name.
For example, with Google Domains you can easily spin up 100 email addresses forwarded to your main mailbox at no extra charge; it comes with your yearly domain renewal.
I create emails for each service I use.
reddit@mydomain.com, google@mydomain.com, walmart@mydomain.com
I can just create an email called junktest@mydomain.com
And if it ever gets too spammy, I can just delete that address from the list and it won't get forwarded to my main inbox.
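The aliasing scheme above can be sketched as a simple filter. This is a minimal sketch; the domain, service names, and blocklist are illustrative, not Google Domains' actual mechanism:

```python
# One dedicated address per service, all forwarding to a single mailbox.
# "Deleting" an alias is modeled here as adding it to a blocklist.
DOMAIN = "mydomain.com"
blocked_aliases = set()

def alias_for(service: str) -> str:
    """Build the per-service address, e.g. 'reddit@mydomain.com'."""
    return f"{service}@{DOMAIN}"

def forwards_to_inbox(address: str) -> bool:
    """Mail to a deleted (blocked) alias no longer reaches the main inbox."""
    return address.endswith("@" + DOMAIN) and address not in blocked_aliases

reddit = alias_for("reddit")
junk = alias_for("junktest")

# junktest@ got too spammy: remove it from the forwarding list
blocked_aliases.add(junk)

print(forwards_to_inbox(reddit))  # True
print(forwards_to_inbox(junk))    # False
```

A nice side effect of this scheme: if spam arrives on walmart@mydomain.com, you know exactly who leaked your address.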
Lajamerr_Mittesdine OP t1_itomfs6 wrote
Reply to comment by ReasonablyBadass in [R] Large Language Models Can Self-Improve by Lajamerr_Mittesdine
CoT simply breaks a problem down into multiple interconnected solution statements to arrive at one conclusive answer.
You can prompt a CoT model to go down different reasoning structures and arrive at different answers (sometimes wrong ones), but those reasoning paths are all independent of one another.
Note that this is fine-tuning an existing LLM.
This fine-tuning is driven in part by a hypermodel that ranks candidate solutions. Those solutions are then used to fine-tune the model further, so it becomes a better reasoner using its own generated answers.
So the model uses its own understanding to generate CoT solution statements. The hypermodel ranks those statements, and the existing model can then be fine-tuned on the newly generated positive and negative solutions, reinforcing what correct solution statements look like as well as what incorrect ones look like.
Future work: So what is stopping the LLM from eventually getting to ~100%? The bottleneck preventing this from going exponential is the hypermodel, which must rank the solutions accurately. Theoretically, if you had a perfect ranker black box, you could eventually get to ~100%. So future work would want either a more accurate ranker overall, or some way to continuously improve the ranker hypermodel in an unsupervised fashion, just as this method improves the LLM.
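The generate-rank-fine-tune loop described above can be sketched roughly like this. Everything here is a stub: the model, the ranking hypermodel, and the questions are stand-ins for illustration, not the paper's actual implementation:

```python
import random

random.seed(0)

def generate_cot_solutions(model, question, n=8):
    """Sample n independent chain-of-thought reasoning paths (stubbed)."""
    return [f"{question} -> reasoning path {i}" for i in range(n)]

def rank(solutions):
    """Stand-in for the ranking hypermodel: order candidates best-first."""
    return sorted(solutions, key=lambda s: random.random(), reverse=True)

def finetune(model, positives, negatives):
    """Stand-in for fine-tuning the LLM on its own ranked generations."""
    model["updates"] += 1
    return model

model = {"updates": 0}
questions = ["Q1", "Q2"]

# Each round, the model is trained on its own ranked outputs.
for _ in range(3):
    for q in questions:
        ranked = rank(generate_cot_solutions(model, q))
        k = len(ranked) // 2
        model = finetune(model, positives=ranked[:k], negatives=ranked[k:])

print(model["updates"])  # 6: 3 rounds x 2 questions
```

The key point the sketch captures: if `rank` were a perfect oracle, every round would only ever reinforce correct reasoning, which is why the ranker is the bottleneck.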
Personal Opinion: What this is really doing is picking low-hanging fruit: it prompts the LLM through reasoning it already understands in different contexts, and promotes those reasonings to the highest-ranking solutions across a broader range of problems. It's not learning entirely new concepts.
Submitted by Lajamerr_Mittesdine t3_ycipui in MachineLearning
Lajamerr_Mittesdine t1_itgeiy5 wrote
Reply to [D] Building the Future of TensorFlow by eparlan
Can someone that has both the perspective of using TensorFlow and using Pytorch give their opinions why you would / wouldn't use each?
And how this announcement changes things for you.
Lajamerr_Mittesdine t1_isa34ib wrote
Reply to comment by Co0k1eGal3xy in [R] Mind's Eye: Grounded Language Model Reasoning through Simulation - Google Research 2022 by Singularian2501
All the answers are incomplete because they don't provide the assumptions necessary to arrive at a complete solution.
A more complete answer would look like this.
>Assuming only gravitational forces, both the lighter and the heavier baseball would fall at the same rate and reach the surface at approximately the same time. This can be affected, however, by additional forces that may be present, such as an atmosphere providing resistance that depends on the surface area, density, and total mass of each object.
Though even that is an incomplete answer.
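Under the "gravity only" assumption, the fall time depends on height and g but never on mass, which is easy to check numerically. The drop height of 10 m and g = 9.81 m/s^2 are example numbers:

```python
import math

def fall_time(height_m: float, g: float = 9.81) -> float:
    """Vacuum free fall: h = (1/2) * g * t^2, so t = sqrt(2h / g).

    Note that mass never appears in the formula.
    """
    return math.sqrt(2 * height_m / g)

light_ball_t = fall_time(10.0)  # lighter baseball
heavy_ball_t = fall_time(10.0)  # heavier baseball, same height
print(light_ball_t == heavy_ball_t)  # True: identical times without air resistance
```

Once you add drag, the equation of motion gains a term that does depend on mass and cross-section, which is exactly the caveat in the quoted answer.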
Lajamerr_Mittesdine t1_is9wk60 wrote
Reply to comment by ThrowThisShitAway10 in [D] Simple Questions Thread by AutoModerator
That paper is exactly what I needed. So many good details in there. Thank you so much!
Lajamerr_Mittesdine t1_is89dnx wrote
Reply to [D] Simple Questions Thread by AutoModerator
I have a project idea and would like some feedback on feasibility.
I want to create an ML model that I would use in a subsequent model's training loop.
This first model would take an image of x-by-x dimensions as input and output instructions for a custom image-creation tool describing the steps to re-create the image.
The instructions would be semi-human-readable, but mostly for the program to interpret; they would look like the following and serve as arguments for the custom image-creation tool.
> 412, 123 #FF00FF ----- This would turn this one pixel fuchsia
> 130, 350 ; 150, 400 #000000 ----- This would turn this rectangle of pixels on the canvas black
And many more complex tools available to take in as arguments.
The reward function would have two stages. The first stage is how close the image is to the original, which is easy to compute. The second stage would reward instruction minimization, i.e., 5,000 steps to recreate the image would be rewarded more highly than 10,000 steps.
It would also be easy to set the upper bound of recreating the image to the total pixel count for that image so that it can be killed if it reaches the limit without creating the 1:1 image it was given as input.
The program would also accept, as an input argument, custom function definitions, which the model would also be able to create. One thing that would incentivize the model to create and use its own custom functions is that the reward would be tweaked so that calling a function it has defined counts as fewer instructions than calling those instructions individually.
This first model is all about training it to recreate images 1:1 in the fewest discrete instructions possible, for any arbitrary image.
This model/program would then be used in a second models training loop which I would like to keep secret for now.
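The two-stage reward described above might look roughly like this. The similarity measure, the penalty weight `alpha`, and the instruction-limit cutoff are all placeholder choices, not a worked-out design:

```python
def image_similarity(recreated, original):
    """Stage 1: fraction of pixels that match exactly (one simple choice)."""
    matches = sum(1 for a, b in zip(recreated, original) if a == b)
    return matches / len(original)

def reward(recreated, original, n_instructions, max_instructions, alpha=0.1):
    """Stage 1 similarity minus a Stage 2 penalty for using more instructions.

    Episodes that hit the upper bound (e.g. one instruction per pixel)
    are cut off with zero reward, matching the kill condition above.
    """
    if n_instructions >= max_instructions:
        return 0.0
    penalty = alpha * n_instructions / max_instructions
    return image_similarity(recreated, original) - penalty

original = [0, 0, 1, 1]  # tiny 4-pixel "image"
perfect = [0, 0, 1, 1]

# Same perfect recreation, fewer instructions -> higher reward
print(reward(perfect, original, n_instructions=2, max_instructions=4))
print(reward(perfect, original, n_instructions=3, max_instructions=4))
```

Crediting a user-defined function as a single instruction would then just mean counting it once in `n_instructions` instead of once per primitive call it contains.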
Lajamerr_Mittesdine t1_jc5b99n wrote
Reply to comment by light24bulbs in [P] Discord Chatbot for LLaMA 4-bit quantized that runs 13b in <9 GiB VRAM by Amazing_Painter_7692
I imagine 1 token per 0.2 seconds (5 tokens/s) would be fast enough. At a rough 0.75 words per token, that's about 225 WPM, well beyond a 60 WPM typist.
Someone should benchmark it on an AMD 7950X3D or Intel 13900-KS
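For reference, converting a token rate to a typing speed works out like this. The 0.75 words-per-token figure is a common rough estimate for English text, not an exact constant:

```python
def tokens_per_sec_to_wpm(tok_per_sec: float, words_per_token: float = 0.75) -> float:
    """Rough conversion: tokens/s * words/token * 60 s/min = words per minute."""
    return tok_per_sec * words_per_token * 60

# 1 token per 0.2 s = 5 tokens/s
print(tokens_per_sec_to_wpm(5))  # 225.0 WPM

# Conversely, a 60 WPM typist corresponds to only ~1.3 tokens/s
print(tokens_per_sec_to_wpm(1.33))
```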