... darkrealms ...

Forums before death by AOL, social media and spammers... "We can't have nice things"
alt.comp.freeware
Generic free software discussions
39,988 messages
Message 38,281 of 39,988
Paul to Marion
Re: Attached is a "conversation" I had w
09 Feb 25 23:46:18
   XPost: alt.comp.os.windows-10, rec.photo.digital   
   From: nospam@needed.invalid   
      
   On Sun, 2/9/2025 6:20 PM, Marion wrote:   
   > Don't read this line by line... but you might want to skim it quickly.   
   >   
   > I'm new to AI where I realized AI can help me figure out what ffmpeg   
   > commands to use when I need to slightly modify videos for posting.   
   >   
   > Normally I ask here - and Paul gives me the answer! :)   
   > But today, I shunned Paul in favor of my (new) good friend, Mr. AI!   
   >   
   > The transcript below shows how AI usually gives the wrong answer at   
   > first but you can hone that answer, little by little, to solve issues.   
   >   
   > Here's what happened:   
   > a. I needed to upload a video to Amazon Vine that was in two parts   
   > b. So all I needed to do was concatenate two short videos I took   
   >   (same camera, same everything)   
   > c. But the second video kept being rotated upside down (still is!)   
   >   
   > In desperation, I asked for AI to help solve the problem...   
   >  Q: Hey AI. What is the Windows ffmpeg command to rotate a video 180
   >  degrees clockwise
      
   ... snip session, to keep the response a bit shorter   
      
   > I know this is frustrating, but I'm confident we can find the cause.   
   > Please provide the information requested above, and we'll get to the   
   > bottom of this.   
      
   It's up to you to decide what conversational format you use with the AI.
      
   To start with, your question can start with a description of inputs,   
   the question, and a series of constraint lines.   
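
   A minimal sketch of that layout, assuming Python for illustration. The
   section labels (INPUTS, QUESTION, CONSTRAINTS) and the build_prompt helper
   are made up for this example, not something any particular AI requires.

```python
def build_prompt(inputs, question, constraints):
    """Assemble one self-contained prompt: inputs, then question, then constraints."""
    lines = ["INPUTS:"]
    lines += [f"- {item}" for item in inputs]
    lines += ["", "QUESTION:", question.strip(), "", "CONSTRAINTS:"]
    lines += [f"- {item}" for item in constraints]
    return "\n".join(lines)


# Example wording is illustrative; it paraphrases the problem described in the thread.
prompt = build_prompt(
    inputs=[
        "Two short video clips taken with the same camera",
        "After concatenation, the second clip plays upside down",
    ],
    question="What ffmpeg command joins the two clips so both play upright?",
    constraints=["Windows command line"],
)
print(prompt)
```

   The point of a single block like this is that every new question carries
   its own description, rather than relying on the model remembering it.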
      
   One of the properties we know of is that the models have limits on tokens,
   and they seemingly forget things easily while doing a symbolic manipulation.
      
   One of the reasons you got as far as you did is that the quality assurance
   stage likely kept cutting in and forcing it to go back and refine the
   question, for each of your thirteen questions.
      
   On the subject matter, FFMPEG is a garbage-in, garbage-out tool :-)
   I think you knew before starting this exercise that FFMPEG is hit
   or miss on things. Or, at least, a human's ability to guess exactly
   which parameters, switched in, would give the response you wanted
   is hit or miss.
      
   The AI is probably right that it needs you to refine things by
   providing all available metadata from each file, to correct
   what is happening.
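
   One way to collect that metadata is ffprobe, which ships with FFmpeg.
   The sketch below is an illustration, not a tested recipe for these
   particular files: it assumes ffprobe is on the PATH, and it assumes the
   rotation shows up either as a "rotate" stream tag or as a "rotation"
   value in the stream side data, which can differ between FFmpeg builds.
   The helper names are made up for the example.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe call that dumps container and stream metadata as JSON."""
    return ["ffprobe", "-v", "error", "-show_format", "-show_streams",
            "-of", "json", path]


def rotation_hints(probe):
    """Collect rotation values from parsed ffprobe JSON (video streams only)."""
    hints = []
    for stream in probe.get("streams", []):
        if stream.get("codec_type") != "video":
            continue
        rotate_tag = stream.get("tags", {}).get("rotate")
        if rotate_tag is not None:
            hints.append(("tag", str(rotate_tag)))
        for side in stream.get("side_data_list", []):
            if "rotation" in side:
                hints.append(("side_data", str(side["rotation"])))
    return hints


def probe_file(path):
    """Run ffprobe on a real file and return the parsed JSON."""
    result = subprocess.run(ffprobe_command(path), capture_output=True,
                            text=True, check=True)
    return json.loads(result.stdout)


# Hand-written sample in the general shape of ffprobe's JSON output,
# so the sketch runs without a video file. The values are invented.
sample = {"streams": [{"codec_type": "video", "tags": {"rotate": "180"}},
                      {"codec_type": "audio"}]}
print(rotation_hints(sample))
```

   Running probe_file() on each clip and pasting both JSON dumps into the
   question gives the AI the per-file differences instead of making it guess.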
      
   For example, one way of doing that would be to say "each video segment
   was shot on an iPhone7 in RAW mode using the 20Mpixel front camera,
   while holding the camera in tall formation, rather than wide formation".
      
   The AI could then better guess at what metadata had been injected   
   into the video, by the camera. The AI would also have a better idea   
   that an iPhone shoots in High Profile and so on. This would reduce   
   some of the stabs in the dark it is making.   
      
   But my experience (not really a lot of questions) with the AI
   is that you're damned if you do and damned if you don't. If you try to
   "lead" the AI, by perhaps including the wrong kind of parameter
   or construction as part of your input, the stupid thing will try
   to make an answer that *includes* your guess. This is bad. On the
   one hand, we don't want to pollute the problem space with
   unnecessary observations. On the other hand, we do want to provide
   enough color commentary that it can better guess what is wrong.
      
   If we were to grab three YouTube videos and try to splice them
   together, there's no guarantee they have all been reduced to some   
   clean baseline before we get them. Whereas if the AI knows all   
   three were shot with the same (named) camera, it will at least   
   know for example, that the camera automatically includes the   
   rotation metadata, as a function of how you held the camera and so on.   
      
   But just again as a general comment, I expect every session with   
   an AI to go like this. It very much depends on "your own intelligence",   
   to turn the "story summarizer" into a "problem solver". It's not AGI,   
   it's not even remotely close to AGI.   
      
   It's similar to a USENET thread: you'll notice how threads go to hell
   as a function of missing details. The participants here are better
   at guessing some things, but they will flounder (and sub-threads result)
   when the answer is looking too broad.
      
   My very first question to the AI illustrates this. As I was
   sitting at the machine, I said to myself "got to avoid giving
   unbounded questions! You know a thing like this will go crazy
   if you do that". And silly me, one of my thoughts was that
   the machine would have "canned intro answers for noobs",
   sort of like a user manual that says not to take it into
   the bath with you. So I asked the machine:
      
      What are your capabilities ?   
      
   I was expecting an answer such as "I summarize text", "I have a   
   primitive image drawing module for artwork", "I can do OCR if   
   you give me an image" and so on.   
      
   Instead, I got yards and yards of text until the limit timer   
   went off... and it erased all the text on the screen.   
      
   So this teaches you, in terms of computer languages that   
   have a "workspace" concept, like BASIC and APL, that as soon   
   as you step into the machine, you are "in the workspace". The   
   guard rails are gone. There is no user manual in there. It seems   
   to have the ability to tell the difference between "continuation   
   of previous chain" versus "new question". I expect the human   
   is providing enough hints for the machine to figure that out.   
      
   It doesn't apply a framework to anything it is doing. For example,   
   I don't see in your three questions, any reference at all by the   
   AI, as to what version of FFMPEG supports a certain parameter format.   
      
   It's interesting, that for you, the machine realizes it needs to   
   "gather samples and run them for its very self". Yet, if it   
   did that, I would expect there would be a token overflow. Even in the   
   data center, it has a 128K or 256K token limit (a token is less than   
   a word). On the DeepSeek distilled models, the limit is something   
   like 4K tokens. And then there is the Excel spreadsheet joke someone
   released, which can accept about seven words of input or so. I would consider
   a model to be "sufficiently capable", if you could give it the   
   URL of the Firefox tarball, and tell the machine to "rewrite that code".   
   Which is hundreds of megabytes of material :-)   
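
   To put rough numbers on that gap, here is a back-of-envelope sketch.
   Everything in it is an assumption for illustration: 4 characters per
   token is only a rule of thumb, 300 MB is a stand-in for "hundreds of
   megabytes", and "K" is taken as 1024.

```python
CHARS_PER_TOKEN = 4  # assumption: rough average, not a property of any specific model


def estimated_tokens(num_bytes, chars_per_token=CHARS_PER_TOKEN):
    """Very rough token count for a body of text of the given size."""
    return num_bytes // chars_per_token


def windows_needed(num_bytes, context_tokens):
    """How many full context windows the text would fill (rounded up)."""
    tokens = estimated_tokens(num_bytes)
    return -(-tokens // context_tokens)  # ceiling division


source_bytes = 300 * 1024 * 1024  # assumption: stand-in for "hundreds of megabytes"

for limit in (4 * 1024, 128 * 1024, 256 * 1024):
    print(f"{limit} token limit: about {windows_needed(source_bytes, limit)} windows")
```

   Even at the larger limits the source tree would be hundreds of windows
   deep, so nothing close to the whole thing fits in one pass.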
      
   summary: Personal opinion, I don't think a conversational style is appropriate.   
            For each question, open a copy of Notepad, provide a good description   
            of inputs, the one-liner question you've got, and then any
            constraints.
            The constraints don't mean anything, and will quite likely be ignored.   
            "Work slowly and step by step." Meaningless stuff like that. It's   
            already got one of those in the prompt, at a guess. Then copy your   
      
   [continued in next message]   
      
   --- SoupGate-DOS v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)