Diaj.ai — The Age of AI

A language model is a statistical system trained to assign probabilities to sequences of words, typically by predicting the next token given the ones before it. Modern large language models (LLMs) are neural networks, most commonly built on the Transformer architecture, trained on massive text datasets to understand and generate human language.
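The idea of "predicting the probability of sequences" can be illustrated with a far simpler model than a Transformer. The sketch below is a toy bigram model, assuming a three-sentence corpus invented for illustration; the function names are hypothetical and it is not how an LLM is implemented, only the same principle at the smallest scale.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Turn raw follow-counts into a probability distribution."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()} if total else {}

def sequence_prob(counts, sentence):
    """Probability of a whole sentence: product of next-word probabilities."""
    prob = 1.0
    words = ["<s>"] + sentence.lower().split()
    for prev, nxt in zip(words, words[1:]):
        prob *= next_word_probs(counts, prev).get(nxt, 0.0)
    return prob

# Invented toy corpus, for illustration only.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)

print(next_word_probs(model, "the"))       # "cat" is twice as likely as "dog"
print(sequence_prob(model, "the cat sat"))
```

An LLM replaces the count table with a neural network and conditions on far more than one previous word, but the output is still a probability distribution over the next token.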

Parameters

The learnable weights in a neural network. More parameters generally mean more capacity to store knowledge and patterns. OpenAI has not published GPT-4's parameter count; the widely repeated figure of ~1.76 trillion is an unconfirmed estimate. Parameters are adjusted during training to minimize prediction error.
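To make "parameters" concrete, here is a minimal sketch that counts the weights and biases in a small fully connected network. The layer sizes are an assumption chosen for illustration (a classic small image classifier shape), and `dense_param_count` is a hypothetical helper, not part of any library.

```python
def dense_param_count(layer_sizes):
    """Count learnable parameters in a fully connected network.

    Each layer has one weight per (input, output) pair plus one bias per output.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Assumed example: 784 inputs, one hidden layer of 128 units, 10 outputs.
print(dense_param_count([784, 128, 10]))  # 101770
```

Even this tiny network has about a hundred thousand parameters; LLMs apply the same bookkeeping to billions of weights.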

Context window

The maximum amount of text a model can process at once (input + output), measured in tokens. Claude Opus 4.6 supports 200K tokens (~150,000 words). A larger context window lets the model reason over longer documents in a single pass.
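Because input and output share the same window, applications usually trim older conversation history to leave room for the reply. The following is a rough sketch of that idea, assuming a crude 4-characters-per-token estimate instead of a real tokenizer; `fit_to_window` and its parameters are hypothetical names.

```python
def estimate_tokens(text):
    """Crude estimate (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)

def fit_to_window(messages, max_tokens, reserve_for_output):
    """Keep the most recent messages that fit in the input budget."""
    budget = max_tokens - reserve_for_output  # input and output share the window
    kept = []
    used = 0
    for msg in reversed(messages):  # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

history = ["a" * 40, "b" * 40, "c" * 40]  # roughly 10 tokens each
recent = fit_to_window(history, max_tokens=30, reserve_for_output=10)
print(len(recent))  # 2: only the two newest messages fit
```

Dropping the oldest messages first is only one policy; summarizing them instead is a common alternative.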

Tokens

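The basic unit of text for LLMs. A token is roughly 4 characters or ¾ of a word in English. "Hello world" = 2 tokens in common tokenizers. API usage is billed by the number of tokens processed, with prices typically quoted per 1,000 or per million tokens. Tokenization affects how well models handle different languages: text that splits into more tokens per word costs more and fills the context window faster.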
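The 4-characters-per-token rule of thumb above can be turned into a quick estimator. This is a sketch under stated assumptions: it does not use a real tokenizer, and the price passed to `estimate_cost` is a made-up placeholder, not any provider's actual rate.

```python
import math

def estimate_tokens(text, chars_per_token=4):
    """Rule-of-thumb token estimate; real tokenizers will differ."""
    return math.ceil(len(text) / chars_per_token)

def estimate_cost(n_tokens, price_per_1k):
    """Cost for n_tokens given a price quoted per 1,000 tokens."""
    return n_tokens / 1000 * price_per_1k

print(estimate_tokens("Hello world"))            # 3
print(estimate_cost(2000, price_per_1k=0.5))     # 1.0 (hypothetical price)
```

The estimator returns 3 for "Hello world" where the text above counts 2, which shows why the heuristic is only suitable for budgeting and not for exact accounting.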

Temperature

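A sampling parameter (commonly 0–2, though the allowed range varies by provider) controlling output randomness. Temperature = 0 always picks the most likely token, giving near-deterministic, predictable outputs (best for code/facts). Temperature = 1 samples from the model's unmodified distribution and suits creative writing. Temperature = 2 is highly random and often incoherent.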
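Mechanically, temperature divides the model's raw scores (logits) before they are converted to probabilities. The sketch below shows the standard softmax-with-temperature calculation; the three logits are invented for illustration, and treating temperature 0 as a plain argmax is a common convention assumed here.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; handle 0 as argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng=random):
    """Pick a token index; temperature 0 means always take the top choice."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.1]  # invented scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top token, while higher ones spread it across the alternatives, which is where the extra randomness comes from.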

Hallucination

When a model generates confident-sounding but factually incorrect information. A key limitation of LLMs. Caused by the model predicting plausible-sounding continuations rather than retrieving verified facts. Reduced, though not eliminated, by retrieval-augmented generation (RAG), grounding, and tool use.

Fine-tuning

Further training a pre-trained base model on a smaller domain-specific dataset to specialize its behavior. Examples: fine-tuning GPT-4 on legal briefs, or Llama on medical literature. Cheaper than training from scratch; requires far less data.
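The mechanics of fine-tuning (start from trained weights, then keep training on a small specialized dataset) can be shown with a deliberately tiny stand-in. The sketch below assumes a one-parameter linear model and two invented datasets; it illustrates the procedure only and says nothing about how any real LLM is fine-tuned.

```python
def train(weight, data, lr, epochs):
    """Gradient descent on mean squared error for the model y = weight * x."""
    for _ in range(epochs):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight

# "Pre-training": a larger general dataset following y = 2x (invented).
general_data = [(float(x), 2.0 * x) for x in range(1, 11)]
base_weight = train(0.0, general_data, lr=0.01, epochs=200)

# "Fine-tuning": a much smaller domain dataset following y = 2.5x (invented).
domain_data = [(1.0, 2.5), (2.0, 5.0)]
tuned_weight = train(base_weight, domain_data, lr=0.05, epochs=50)

print(round(base_weight, 3), round(tuned_weight, 3))
```

The second call starts from `base_weight` instead of zero, so it needs only two examples and a few steps to shift the model toward the domain data.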

What makes something an AI agent?

An AI agent is a system that uses an LLM as its reasoning core, but can also take actions in the world — calling tools, browsing the web, writing and executing code, managing files, or sending messages. Unlike a chatbot (which only responds), an agent operates in a loop: observe → think → act → observe again.

ReAct loop

The most common agent pattern: Reasoning + Acting interleaved. The model thinks step-by-step (chain-of-thought), decides on an action (tool call), observes the result, then reasons again. Enables complex multi-step task completion.
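The observe, think, act cycle described above can be sketched as a plain loop. In this illustration the LLM is replaced by `scripted_model`, a hard-coded stand-in, and the only tool is a toy calculator; all names and the transcript format are assumptions made for the example.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def calculator(expression):
    """Toy tool: evaluates 'number operator number', e.g. '6 * 7'."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))

def scripted_model(transcript):
    """Stand-in for an LLM: returns (thought, action, argument)."""
    if "Observation:" not in transcript:
        return ("I need to multiply the numbers.", "calculator", "6 * 7")
    last_observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return ("I have the result.", "finish", last_observation)

def react_loop(question, model, tools, max_steps=5):
    """Alternate reasoning and acting until the model finishes."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        thought, action, argument = model(transcript)
        transcript += f"Thought: {thought}\nAction: {action}[{argument}]\n"
        if action == "finish":
            return argument, transcript
        observation = tools[action](argument)  # act, then observe
        transcript += f"Observation: {observation}\n"
    return None, transcript  # step limit reached without an answer

answer, transcript = react_loop(
    "What is 6 times 7?", scripted_model, {"calculator": calculator}
)
print(transcript)
```

The `max_steps` cap is a design choice worth keeping in any real agent: without it, a model that never emits a finishing action would loop indefinitely.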

Tool use / Function calling

The ability to call external APIs, run code, query databases, or browse the web. The model does not execute anything itself: it outputs a structured call (usually JSON) specifying which tool to use and with what parameters, and the surrounding application runs the tool. Results are injected back into context.
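A minimal sketch of the application side of tool use follows. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and its canned return value are all assumptions for illustration; each provider defines its own schema.

```python
import json

def get_weather(city, unit="celsius"):
    """Hypothetical tool: returns canned data instead of calling a real API."""
    return {"city": city, "temperature": 21, "unit": unit}

TOOLS = {"get_weather": get_weather}  # registry of callable tools

def dispatch_tool_call(raw_call):
    """Parse the model's JSON call, run the named tool, return JSON text."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**call.get("arguments", {}))
    return json.dumps(result)

# What a model might emit (assumed format).
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
tool_result = dispatch_tool_call(model_output)

# The result is appended to the conversation so the model can read it.
context = [{"role": "tool", "content": tool_result}]
print(context)
```

Returning an error payload for unknown tools, instead of raising, lets the model see the failure in context and try a different call.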

Memory systems

How agents persist information: (1) In-context memory, the conversation history in the current window. (2) External memory, such as vector databases for long-term recall. (3) Procedural memory, meaning learned workflows. (4) Episodic memory, meaning summaries of past sessions.
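External memory is the type most easily shown in code. The sketch below assumes a toy bag-of-words "embedding" and an in-process list in place of a learned embedding model and a real vector database; `VectorMemory` is a hypothetical class, and the stored notes are invented.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class VectorMemory:
    """Stores notes with their vectors and recalls the most similar ones."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

memory = VectorMemory()
memory.add("User prefers dark mode")
memory.add("Project deadline is Friday")
memory.add("API key lives in the vault")

print(memory.search("when is the project deadline"))
```

An agent would typically run such a search at the start of each turn and paste the top results into the context window, which connects external memory back to in-context memory.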

Multi-agent systems

Networks of AI agents that collaborate, delegate, and check each other's work. An orchestrator agent breaks down a task and assigns subtasks to specialist agents (coder, researcher, critic). Enables parallelism and specialization.
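The orchestrator pattern can be sketched with plain functions standing in for LLM-backed agents. Everything here is an assumption for illustration: the three specialists return placeholder strings, and the plan is hard-coded where a real orchestrator would ask a model to produce it.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for specialist agents; real ones would each call an LLM.
def researcher(task):
    return f"notes on {task}"

def coder(task):
    return f"code for {task}"

def critic(task):
    return f"review of {task}"

SPECIALISTS = {"researcher": researcher, "coder": coder, "critic": critic}

def orchestrate(goal):
    """Split a goal into subtasks, run them in parallel, then review."""
    # Hard-coded plan (assumption); an orchestrator LLM would generate this.
    plan = [
        ("researcher", f"background for {goal}"),
        ("coder", f"implementation of {goal}"),
    ]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(SPECIALISTS[role], task) for role, task in plan]
        drafts = [f.result() for f in futures]  # results kept in plan order
    review = SPECIALISTS["critic"]("; ".join(drafts))
    return {"drafts": drafts, "review": review}

result = orchestrate("a CSV parser")
print(result["review"])
```

Independent subtasks run concurrently, while the critic runs only after both drafts exist, which reflects the parallelism and checking roles described above.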