|    alt.comp.freeware    |    Generic free software discussions    |    39,988 messages    |
|    Message 38,281 of 39,988    |
|    Paul to Marion    |
|    Re: Attached is a "conversation" I had w    |
|    09 Feb 25 23:46:18    |
XPost: alt.comp.os.windows-10, rec.photo.digital
From: nospam@needed.invalid

On Sun, 2/9/2025 6:20 PM, Marion wrote:
> Don't read this line by line... but you might want to skim it quickly.
>
> I'm new to AI where I realized AI can help me figure out what ffmpeg
> commands to use when I need to slightly modify videos for posting.
>
> Normally I ask here - and Paul gives me the answer! :)
> But today, I shunned Paul in favor of my (new) good friend, Mr. AI!
>
> The transcript below shows how AI usually gives the wrong answer at
> first but you can hone that answer, little by little, to solve issues.
>
> Here's what happened:
> a. I needed to upload a video to Amazon Vine that was in two parts
> b. So all I needed to do was concatenate two short videos I took
>    (same camera, same everything)
> c. But the second video kept being rotated upside down (still is!)
>
> In desperation, I asked for AI to help solve the problem...
> Q: Hey AI. What is the Windows ffmpeg command to rotate a video 180
> degrees clockwise

... snip session, to keep the response a bit shorter

> I know this is frustrating, but I'm confident we can find the cause.
> Please provide the information requested above, and we'll get to the
> bottom of this.

It's up to you to decide what conversational format you use with the AI.

To start with, your question can open with a description of the inputs,
then the question itself, then a series of constraint lines.

One of the properties we know of is that the models have limits on
tokens, and they seem to forget things easily while doing a symbolic
manipulation.

One of the reasons you got as far as you did is that the quality
assurance stage likely kept cutting in and forcing it to go back and
refine the question.
That goes for each of your thirteen questions.

On the subject matter, FFMPEG is a garbage in, garbage out tool :-)
I think you knew before starting this exercise that FFMPEG is hit or
miss on things. Or at least, that a human's ability to guess exactly
which parameters to switch in, to get the response you wanted, is hit
or miss.

The AI is probably right that it needs the refinement of you
providing all available metadata from each file, to correct
what is happening.

For example, one way of doing that would be to say "each video segment
was shot on an iPhone7 in RAW mode using the 20Mpixel front camera,
while holding the camera in tall formation rather than wide formation".

The AI could then better guess at what metadata had been injected
into the video by the camera. The AI would also have a better idea
that an iPhone shoots in High Profile and so on. This would reduce
some of the stabs in the dark it is making.

But my experience with the AI (not really a lot of questions) is that
you're damned if you do and damned if you don't. If you try to
"lead" the AI, by perhaps including the wrong kind of parameter
or construction as part of your input, the stupid thing will try
to make an answer that *includes* your guess. This is bad. On the
one hand, we don't want to pollute the problem space with
unnecessary observations. On the other hand, we do want to provide
enough color commentary that it can better guess what is wrong.

If we were to grab three Youtube videos and try to splice them
together, there's no guarantee they have all been reduced to some
clean baseline before we get them.
Whereas if the AI knows all
three were shot with the same (named) camera, it will at least
know, for example, that the camera automatically includes the
rotation metadata as a function of how you held the camera, and so on.

But again, just as a general comment, I expect every session with
an AI to go like this. It very much depends on "your own intelligence"
to turn the "story summarizer" into a "problem solver". It's not AGI,
it's not even remotely close to AGI.

It's similar to a USENET thread: you'll notice how threads go to hell
as a function of missing details. The participants here are better
at guessing some things, but they will flounder (and sub-threads result)
when the answer is looking too broad.

My very first question of the AI illustrates this. As I was
sitting at the machine, I said to myself "got to avoid giving
unbounded questions! You know a thing like this will go crazy
if you do that". And silly me, one of my thoughts on what
the machine would have was "canned intro answers for noobs",
sort of like a user manual that says not to take it into
the bath with you. So I asked the machine:

   What are your capabilities ?

I was expecting an answer such as "I summarize text", "I have a
primitive image drawing module for artwork", "I can do OCR if
you give me an image" and so on.

Instead, I got yards and yards of text until the limit timer
went off... and it erased all the text on the screen.

So this teaches you, in terms of computer languages that
have a "workspace" concept, like BASIC and APL, that as soon
as you step into the machine, you are "in the workspace". The
guard rails are gone. There is no user manual in there. It seems
to have the ability to tell the difference between "continuation
of previous chain" versus "new question".
I expect the human
is providing enough hints for the machine to figure that out.

It doesn't apply a framework to anything it is doing. For example,
I don't see in your three questions any reference at all by the
AI as to what version of FFMPEG supports a certain parameter format.

It's interesting that, for you, the machine realizes it needs to
"gather samples and run them for its very self". Yet if it
did that, I would expect there would be a token overflow. Even in the
data center, it has a 128K or 256K token limit (a token is less than
a word). On the DeepSeek distilled models, the limit is something
like 4K tokens. And then there's the Excel spreadsheet joke someone
released, which can accept about seven words of input or so. I would
consider a model to be "sufficiently capable" if you could give it the
URL of the Firefox tarball and tell the machine to "rewrite that code".
Which is hundreds of megabytes of material :-)

Summary: Personal opinion, I don't think a conversational style is appropriate.
   For each question, open a copy of Notepad, provide a good description
   of inputs, the one-liner question you've got, and then any constraints.
   The constraints don't mean anything, and will quite likely be ignored.
   "Work slowly and step by step." Meaningless stuff like that. It's
   already got one of those in the prompt, at a guess. Then copy your

[continued in next message]

--- SoupGate-DOS v1.05
 * Origin: you cannot sedate... all the things you hate (1:229/2)
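
A rough sketch of the Notepad layout from the summary above (a
description of inputs, the one-liner question, then any constraints),
written in Python. The function name build_prompt, the file names
part1.mp4 and part2.mp4, and the constraint text are all made up for
illustration. Nothing here comes from the original session, and it is
only one way such a prompt could be laid out.

```python
def build_prompt(inputs, question, constraints):
    """Assemble one self-contained prompt: inputs, then question, then constraints."""
    lines = ["Inputs:"]
    lines += [f"- {item}" for item in inputs]
    lines += ["", "Question:", question.strip()]
    if constraints:
        lines += ["", "Constraints:"]
        lines += [f"- {item}" for item in constraints]
    return "\n".join(lines)


# Hypothetical values, for illustration only.
prompt = build_prompt(
    inputs=[
        "part1.mp4 and part2.mp4, shot on the same camera with the same settings",
        "part2.mp4 plays upside down after concatenation",
    ],
    question="What Windows ffmpeg command joins the two files with both parts upright?",
    constraints=["State which ffmpeg version each option needs."],
)
print(prompt)
```

Writing the whole thing out once, instead of feeding it in over a dozen
turns, keeps the file details and the question in the same block of text.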
(c) 1994, bbs@darkrealms.ca