undefined

points

[-]

I've found that to be accurate when asking it questions that require ~PhD level knowledge to answer. e.g. Gemini and ChatGPT both seem to be capable of answering questions I have as I work through a set of notes on algebraic geometry.

Its performance on riddles has always seemed mostly irrelevant to me. Want to know if models can program? Ask them to program, and give them access to a compiler (they can now).

Want to know if it can do PhD level questions? Ask it questions a PhD (or at least grad student) would ask it.

They also reflect the tone and knowledge of the user and question. Ask it about your cat's astrological sign and you get emojis and short sentences in list form. Ask it why large atoms are unstable and you get paragraphs with larger vocabulary. Use jargon and it becomes more of an expert. etc.

by NicuCalcea11 hours ago|

parent|

[-]

I don't know about algebraic geometry, but AI is absolutely terrible at communications and social sciences. I know because I can tell when my postgraduate students use it.

by ndriscoll10 hours ago|

parent|

[-]

Are you sure? What about when you use it? e.g. I suppose asking it to critique experimental design and analytical methodology, or identify potential confounders and future areas to explore, or help summarize nearby research, etc.

If you can tell when your students use it, presumably you mean they're just copying whatever, which just sounds like that student doesn't know what they're doing or is being lazy. That doesn't mean the model isn't capable; it means an incapable person won't know what they'd want to ask of it.

Additionally, even for similar prompts, my experience is that the models for professional use (e.g. gpt-codex) take on a much more professional tone and level of pragmatism (e.g. no sycophancy) than models for general consumer entertainment use (e.g. chatgpt).

by NicuCalcea10 hours ago|

parent|

[-]

> What about when you use it?

I use AI for coding, but not for anything involving writing text, it's just horrendous at it. It just spews verbose slop, devoid of meaning, original thought or nuanced critique.

> That doesn't mean the model isn't capable; it means an incapable person won't know what they'd want to ask of it.

So it's user error again then, eh? PhD experts are able to help even "incapable" students, that's often a big part of their job.