AI for Students: Understanding AI

Understanding How AI Works

Learn the fundamentals of large language models to use them more effectively and understand their capabilities and limitations.

Why Understanding AI Matters for Students

Understanding how AI tools work helps you use them more effectively, recognise their limitations, and make better decisions about when and how to apply them to your studies. You do not need to be a computer scientist to benefit from this knowledge.

What Is a Large Language Model?

Large language models (LLMs) like ChatGPT, Claude, and Gemini are AI systems trained to process and generate human language. Think of them as sophisticated pattern-recognition systems that have learnt from billions of examples of written text.

These models do not "think" or "understand" in the way humans do. Instead, they predict what words are most likely to come next based on patterns they have learnt during training. When you ask a question, the model generates a response by predicting the most appropriate sequence of words, one token at a time.

🧠 What They Are

Pattern-matching systems trained on vast amounts of text
Statistical models that predict likely word sequences
Tools that can process and generate human-like text
Systems that learn relationships between concepts and words

❌ What They Are Not

Not conscious beings with understanding or awareness
Not databases that look up factual information
Not connected to the internet in real time (unless explicitly enabled)
Not infallible sources of truth

How LLMs Are Trained

Stage 1: Pre-Training

The model is trained on massive datasets of text from books, websites, academic papers, and other sources. During this phase, it learns patterns, relationships between words, grammar, facts about the world, and reasoning patterns.

Key Point: The model learns from examples in its training data, not from a structured database of facts. This is why it can sometimes "know" information but present it incorrectly.

Stage 2: Fine-Tuning

After pre-training, models are further trained to be more helpful, accurate, and safe. This often involves reinforcement learning from human feedback (RLHF), where human reviewers rate different responses, teaching the model to prefer certain types of answers.

Why This Matters: This is why AI assistants tend to be helpful and detailed rather than giving one-word answers, even though technically both might be "correct".

Key Concepts Every Student Should Understand

📝 Tokens

AI models break text into small units called tokens. A token might be a word, part of a word, or punctuation. The model processes and generates text one token at a time.

Student Example: "Understanding" might be split into "Under", "stand", and "ing". As a rough guide for English text, 1,000 tokens correspond to about 750 words.
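The rule of thumb above can be turned into a quick estimate. This is a minimal sketch, assuming the 750-words-per-1,000-tokens ratio holds for your text; the function name estimate_tokens is made up for illustration, and real tokenisers will give somewhat different counts.

```python
# Rule of thumb from the text: about 750 words per 1,000 tokens.
# This is an approximation for English prose, not an exact tokeniser.
WORDS_PER_TOKEN = 750 / 1000


def estimate_tokens(text: str) -> int:
    """Estimate how many tokens `text` uses, based on its word count."""
    word_count = len(text.split())
    return round(word_count / WORDS_PER_TOKEN)


essay = "word " * 1500  # stand-in for a 1,500-word essay
print(estimate_tokens(essay))  # 2000
```

So a 1,500-word essay uses roughly 2,000 tokens under this assumption.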

🧵 Context Window

The context window is the amount of text the model can "remember" at once. Everything you have written in a conversation, plus the model's responses, counts towards this limit.

Practical Impact: If you have a very long conversation or upload large documents, the model might "forget" earlier parts. Context windows range from about 8,000 to 200,000+ tokens depending on the model.
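To see why earlier parts of a conversation can drop out, here is a minimal sketch of one possible strategy: keep only the most recent messages that fit a token budget. The helper names (estimate_tokens, fit_to_window) and the tiny budget of 25 tokens are assumptions for illustration; real assistants may truncate, summarise, or refuse over-long input instead.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 750 words per 1,000 tokens."""
    return round(len(text.split()) / 0.75)


def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose estimated total fits the budget."""
    kept = []
    used = 0
    # Walk backwards from the newest message and stop when the budget is full.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


conversation = [
    "My essay topic is the causes of the French Revolution.",
    "Here is my first draft paragraph about the bread shortages.",
    "Can you suggest a stronger concluding sentence?",
]
# With this tiny budget the first message no longer fits,
# so the essay topic is the part that gets "forgotten".
print(fit_to_window(conversation, max_tokens=25))
```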

🎲 Temperature

Temperature controls how "creative" or "random" the model's responses are. Lower temperature (0-0.3) makes responses more focused and predictable. Higher temperature (0.7-1.0) makes responses more varied and creative.

When to Care: For factual questions or academic work, lower temperature is better. For brainstorming or creative writing, higher temperature can be useful.
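A common way to apply temperature is to divide the model's raw scores by it before converting them into probabilities (a "softmax"). The sketch below assumes that approach; the scores for "Paris", "Lyon", and "Rome" are invented, and the function requires a temperature above 0 (systems often treat 0 as "always pick the top token").

```python
import math


def softmax_with_temperature(scores: dict[str, float], temperature: float) -> dict[str, float]:
    """Convert raw scores into probabilities. Temperature must be above 0."""
    scaled = {token: score / temperature for token, score in scores.items()}
    top = max(scaled.values())  # subtract the maximum for numerical stability
    exps = {token: math.exp(value - top) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}


# Made-up scores for the next token after "The capital of France is".
scores = {"Paris": 4.0, "Lyon": 2.0, "Rome": 1.0}

for temperature in (0.2, 1.0):
    probs = softmax_with_temperature(scores, temperature)
    print(temperature, {token: round(p, 3) for token, p in probs.items()})
```

At the lower temperature almost all of the probability goes to the top-scoring token; at the higher temperature the alternatives keep a noticeable share, which is where the extra variety comes from.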

🔮 Probabilistic Generation

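The model generates responses by assigning a probability to each possible next token, based on the input and its training, and then choosing from the likely candidates. Because this choice involves some randomness, the model might give slightly different answers to the same question asked multiple times.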

Why This Matters: Two students asking the same question might get differently worded answers that are both valid. The model is not looking up a fixed answer.
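The difference between "always pick the top token" and "sample from the probabilities" can be sketched in a few lines. The probabilities below are hypothetical, and the function names are chosen for illustration only.

```python
import random

# Hypothetical probabilities for the next token (they sum to 1).
next_token_probs = {"Paris": 0.85, "Lyon": 0.10, "Rome": 0.05}


def pick_most_likely(probs: dict[str, float]) -> str:
    """Always return the single most probable token (no randomness)."""
    return max(probs, key=probs.get)


def sample_token(probs: dict[str, float], rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random()  # unseeded, so each run can differ
print(pick_most_likely(next_token_probs))  # Paris
print([sample_token(next_token_probs, rng) for _ in range(10)])
```

The second line of output changes from run to run: mostly "Paris", with the occasional less likely token. That variation is what two students see when they ask the same question.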

How AI Generates Responses

When you send a message to an AI, here is what happens:

Input Processing

Your message is broken into tokens, and each token is converted into a list of numbers (an embedding) that the model can process.

Pattern Matching

The model analyses patterns in your input and relates them to patterns it learnt during training.

Token Prediction

The model calculates which tokens are most likely to come next, taking into account the conversation so far (up to the limit of its context window) and the preferences for helpful answers it learnt during fine-tuning.

Iterative Generation

This process repeats, with each new token influencing what comes next, until the model produces a special end-of-response token or reaches a length limit.
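The loop described in these steps can be imitated with a toy model. The sketch below is not how a real LLM works internally: it swaps the neural network for simple counts of which word follows which in a two-sentence "training set", and it treats whole words as tokens. The names train_bigrams and generate, the corpus, and the "<end>" marker are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

END = "<end>"  # made-up marker meaning "the response is complete"


def train_bigrams(text: str) -> dict[str, Counter]:
    """Count, for each word, which words follow it in the training text."""
    words = text.split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def generate(model: dict[str, Counter], start: str, max_tokens: int = 10) -> str:
    """Repeatedly append the most common next word until END or the limit."""
    output = [start]
    for _ in range(max_tokens):
        candidates = model.get(output[-1])
        if not candidates:
            break  # the model has never seen this word, so it cannot continue
        next_word = candidates.most_common(1)[0][0]
        if next_word == END:
            break
        output.append(next_word)
    return " ".join(output)


corpus = (
    "a model predicts the next token <end> "
    "a student checks the next token <end>"
)
model = train_bigrams(corpus)
print(generate(model, "a"))  # a model predicts the next token
```

A real model looks at the whole conversation rather than only the previous word, and usually samples rather than always taking the top choice, but the overall shape is the same: predict, append, repeat, stop.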

⚠️ What This Means for Your Studies

Always verify important facts: The model generates plausible-sounding text, not guaranteed truth.
Understand that AI "hallucinates": Sometimes it confidently states false information because it is predicting likely-sounding text, not retrieving facts.
Be specific in your prompts: Better input leads to better output because it helps the model identify the most relevant patterns.
Use AI as a tool, not an authority: Think critically about responses and use your own judgement.
Be aware of the knowledge cut-off: Models are trained on data up to a certain date and cannot know about events after that unless they have web search capabilities.
