
The Week the Machines Learned to Fall

Published on April 20, 2026 · 4 min read

In Beijing, a robot named Lightning finished a half-marathon in fifty minutes and twenty-six seconds, six minutes ahead of Jacob Kiplimo's human world record. At the starting line, several of its competitors fell apart after four steps. Both things happened on the same morning, in the same race, and neither stayed in the news for long. This is the week the machines learned to fall, to give away entire worlds, and to fold shirts they had never seen.

The event of the week was a stress test dressed as a marathon. My grandfather, who never ran more than fifty meters in his life, used to say that a man is known by how he gets up. The line worked because it implied, without saying so, that falling was the default. Last Sunday, in the Yizhuang industrial zone southeast of Beijing, a red humanoid called Lightning — built by Honor, the Chinese phone company — finished the Beijing E-Town Half Marathon in fifty minutes and twenty-six seconds. Jacob Kiplimo's human world record is fifty-six forty-two. More than a hundred robots entered the race, about forty percent of them fully autonomous; the rest were remotely piloted. Several fell apart at the starting line. What was genuinely new — and here one imagines my grandfather nodding from his chair — is that the losers were also machines. The race crossed a threshold no one announced: it is no longer man against machine. It is machine against machine, and the human, if anything, runs alongside.

The second surprise arrived without a press kit. Four days earlier, Tencent — China's largest videogame company — published on Hugging Face a model called HY-World 2.0, or more specifically WorldMirror 2.0, about one-point-two billion parameters with an open commercial license. Translation: anyone with a computer can now turn a text prompt, a photograph, several photographs, or a video into a full three-dimensional world — meshes, Gaussian splats, point clouds — exportable to Unity, Unreal, or Blender, and sell it. A small studio in Oaxaca can do this afternoon what a year ago required a Google lab. There is commercial calculation behind the gesture: Tencent is betting the community will improve the code for free. But the effect is the same as when a new toll road opens without a toll: companies that weren't on the original blueprint start driving through.
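
If you want a sense of how low the barrier now sits, the whole transaction fits in a few lines of Python. Here is a minimal sketch of fetching the open weights with the standard huggingface_hub client; the repository id and local path are my assumptions, not Tencent's documentation, so check the actual model page before running it.

```python
# Minimal sketch: download the openly licensed WorldMirror 2.0 weights.
# NOTE: the repo id below is an assumption; confirm it on Hugging Face.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="tencent/HY-WorldMirror-2.0",  # hypothetical repo id
    local_dir="./worldmirror-2.0",         # where the checkpoint lands
)
print(f"Weights downloaded to {local_dir}")
```

From there, whatever the pipeline exports (meshes, Gaussian splats, point clouds) imports into Unity, Unreal, or Blender like any other asset.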

The same weekend, Nvidia published Lyra 2.0, built on a fourteen-billion-parameter diffusion transformer. It takes a single photograph — four hundred eighty by eight hundred thirty-two pixels, smaller than the one you took yesterday — generates an eighty-one-frame walkthrough video, and from that walkthrough builds a three-dimensional world you can fly through in real time. A picture of your living room, and suddenly you are inside it, looking behind the sofa. The catch, because there is always a catch, is that Nvidia released it for research use only. No product. No commercial deployment. In Mexico we say they lend you the car but not the keys. The reason is transparent: Nvidia sells Omniverse, its corporate platform for virtual worlds, and cannibalizing it would be imprudent. So the miracle exists, is online, and is simultaneously unavailable. Capitalism teaching a master class in Maya archaeology.

Across the sea in Tokyo, Sakana AI published something called Digital Ecosystems. It is, literally, a browser game. You go to pub.sakana.ai/digital-ecosystem and find neural cellular automata where digital species compete and cooperate on a shared grid. You draw walls, seed species, tune parameters, and watch the system stabilize in what specialists call — solemnly — the excitable edge of chaos. There is no product. There is no business model. Llion Jones, Sakana's co-founder and one of the eight original authors of the 2017 paper that invented Transformers, is the kind of researcher who builds things to understand something, not to sell it. In a week when every AI announcement arrives with a funding round attached, publishing a toy without an invoice is almost a political provocation.
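
For readers who have never poked at one, the skeleton of such an ecosystem is small enough to write over coffee. The sketch below is a toy classical cellular automaton, with hand-written rules rather than the small learned networks Sakana's neural version uses; the grid size, species count, and invasion probability are all illustrative.

```python
# Toy ecosystem: digital species competing for cells on a shared grid.
# Classical, hand-written CA rules; Sakana's demo uses *neural* CAs,
# where the update rule is a small learned network instead of this one.
import numpy as np

rng = np.random.default_rng(0)
SIZE, SPECIES, STEPS = 64, 3, 200

# 0 = empty ground; 1..SPECIES = competing species.
grid = rng.integers(0, SPECIES + 1, size=(SIZE, SIZE))

def step(grid):
    """One tick: a randomly chosen neighbor tries to invade each cell."""
    # Stack the four orthogonal neighbors of every cell (toroidal edges).
    neighbors = np.stack([
        np.roll(grid, 1, axis=0), np.roll(grid, -1, axis=0),
        np.roll(grid, 1, axis=1), np.roll(grid, -1, axis=1),
    ])
    # Pick one neighbor per cell at random.
    choice = rng.integers(0, 4, size=grid.shape)
    invader = np.take_along_axis(neighbors, choice[None], axis=0)[0]
    # An occupied invader takes the cell with probability 0.3.
    attack = (rng.random(grid.shape) < 0.3) & (invader != 0)
    out = grid.copy()
    out[attack] = invader[attack]
    return out

for _ in range(STEPS):
    grid = step(grid)

# Census of survivors: how much territory each species holds at the end.
print({s: int((grid == s).sum()) for s in range(1, SPECIES + 1)})
```

Run long enough, the grid settles into patches and shifting frontiers, which is the same kind of stabilization the Sakana page lets you steer by drawing walls and reseeding species.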

And at the end, the quietest story and perhaps the most important. Lucy Shi, a researcher at Physical Intelligence and a Stanford PhD student, presented π0.7 — a generalist robot model that folded laundry on a UR5e arm it had never seen during training, matching expert human teleoperators on the first attempt. Then it did something stranger with an air fryer. The model began with a five percent success rate at the full sequence: opening the fryer, placing something inside, and closing it. Shi spent thirty minutes refining the natural-language instruction — not retraining the model, only rewording the sentence — until the rate reached ninety-five percent. The robot did not learn a new task. It learned to be better directed, the way a bright student is helped by a clearer question. Engineers call this emergent compositional generalization. My grandfather would have called it knowing how to listen.