April was the month for large language models. There was one announcement after another; most new models were larger than their predecessors, and several claimed to be significantly more energy efficient. The largest (as far as we know) is Google’s GLaM, with 1.2 trillion parameters, yet it required significantly less energy to train than GPT-3. Chinchilla has roughly ¼ as many parameters as DeepMind’s Gopher, but claims to outperform both Gopher and GPT-3. It’s not clear where the race to bigger and bigger models will end, or where it will lead us. The PaLM model claims to be able to reason about cause and effect (in addition to being more efficient than other large models); we don’t yet have thinking machines (and we may never), but we’re getting closer. It’s also good to see that energy efficiency has become part of the conversation.
AI
- Google has created GLaM, a 1.2-trillion-parameter model (7 times the size of GPT-3). Training GLaM required 456 megawatt-hours, about ⅓ of the energy used to train GPT-3. GLaM uses a Mixture-of-Experts (MoE) architecture, in which different subsets of the neural network are activated depending on the input (a toy sketch of the idea appears after this list).
- Google has released a dataset of 3D-scanned household items. This will be invaluable for anyone working on AI for virtual reality.
- FOMO (Faster Objects, More Objects) is a machine learning model for object detection in real time that requires less than 200KB of memory. It’s part of the TinyML movement: machine learning for small embedded systems.
- LAION (Large Scale Artificial Intelligence Open Network) is a non-profit, free, and open organization that is creating large models and making them available to the public; it’s what OpenAI was supposed to be. Its first release is a dataset of image-text pairs for training models similar to DALL-E.
- NVIDIA is using AI to automate the design of its latest GPU chips.
- Using AI to inspect sewer pipes is one example of an “unseen” AI application. It’s infrastructural; it doesn’t risk incorporating biases or significant ethical problems; and (if it works) it improves the quality of human life.
- Large language models are generally based on text. Facebook is working on building a language model from spoken language, which is a much more difficult problem.
- STEGO is a new algorithm for automatically labeling image data. It uses transformers to understand relationships between objects, allowing it to segment and label objects without human input.
- A researcher has developed a model for predicting first impressions and stereotypes, based on a photograph. They’re careful to say that this model could easily be used to fine-tune fakes for maximum impact, and that “first impressions” don’t actually say anything about a person.
- A group building language models for the Māori people shows that AI for indigenous languages requires different ways of thinking about artificial intelligence, data, and data rights.
- AI21 is a new company offering a large language model “as a service.” They allow customers to train custom versions of the model, and they claim to make humans and machines “thought partners.”
- Researchers have found a method for reducing toxic text generated by language models. It sounds like a GAN (generative adversarial network), in which a model trained to produce toxic text “plays against” a model being trained to detect and reject toxicity (a toy version of that adversarial loop also appears after this list).
- More bad applications of AI: companies are using AI to monitor your mood during sales calls. This questionable feature will soon be coming to Zoom.
- Primer has developed a tool that uses AI to transcribe, translate, and analyze intercepted communications in the war between Russia and Ukraine.
- DeepMind claims that another new large language model, Chinchilla, outperforms GPT-3 and Gopher with roughly ¼ the number of parameters. It was trained on roughly 4 times as much data, but with fewer parameters, it requires less energy to train and fine-tune.
- Data Reliability Engineering (DRE) borrows ideas from SRE and DevOps as a framework to provide higher-quality data for machine learning applications while reducing the manual labor required. It’s closely related to data-centric AI.
- OpenAI’s DALL-E 2 is a new take on their system (DALL-E) for generating images from natural language descriptions. It is also capable of modifying existing artworks based on natural language descriptions of the modifications. OpenAI plans to open DALL-E 2 to the public, on terms similar to GPT-3’s.
- Google’s new Pathways Language Model (PaLM) can understand concepts and reason about cause and effect, in addition to being relatively energy efficient. It’s another step toward AI that actually appears to think.
- SandboxAQ is an Alphabet startup that is using AI to build technologies needed for a post-quantum world. They’re not doing quantum computing as such, but solving problems such as protocols for post-quantum cryptography.
- IBM has open sourced the Generative Toolkit for Scientific Discovery (GT4SD), a collection of generative models designed to produce new ideas for scientific research, both in machine learning and in areas like biology and materials science.
- Waymo (Alphabet’s self-driving car company) now offers driverless service in San Francisco. San Francisco is a more challenging environment than Phoenix, where Waymo has offered driverless service since 2020. Participation is limited to members of their Trusted Tester program.
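The Mixture-of-Experts idea behind GLaM is easy to see in miniature: a small router network picks which expert subnetwork handles each input, so most of the model’s weights sit idle on any given example. Here’s a minimal, hypothetical sketch in PyTorch; a toy top-1 router, not GLaM’s actual architecture or code:

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Toy Mixture-of-Experts layer: a router picks one small
    feed-forward expert per input, so only a subset of the network's
    weights does any work for a given example."""

    def __init__(self, dim: int = 16, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)  # scores every expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, dim)
        choice = self.router(x).argmax(dim=-1)  # top-1 expert per input
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():  # only the chosen experts run at all
                out[mask] = expert(x[mask])
        return out

moe = TinyMoE()
print(moe(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```

Scaled up, that routing trick is how a 1.2-trillion-parameter model can spend less energy per example than a much smaller dense one.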
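The detoxification item only says the method “sounds like” a GAN, so here is a deliberately toy version of that adversarial loop, with 2-D vectors standing in for text: an “attacker” model learns to slip toxic samples past a detector while the detector learns to catch them. All names and numbers here are illustrative assumptions, not the researchers’ setup:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins: "text" is a 2-D vector; the real work uses language models.
attacker = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2))
detector = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
opt_a = torch.optim.Adam(attacker.parameters(), lr=1e-2)
opt_d = torch.optim.Adam(detector.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    noise = torch.randn(64, 2)
    toxic = attacker(noise)                                  # attacker's "toxic text"
    benign = torch.randn(64, 2) + torch.tensor([-2.0, 0.0])  # benign "text"

    # Detector turn: flag attacker output (label 1), pass benign data (label 0).
    d_loss = bce(detector(toxic.detach()), torch.ones(64, 1)) + \
             bce(detector(benign), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Attacker turn: produce samples the detector scores as benign.
    a_loss = bce(detector(attacker(noise)), torch.zeros(64, 1))
    opt_a.zero_grad()
    a_loss.backward()
    opt_a.step()
```

The payoff of the game is the detector: hardened against increasingly evasive toxic text, it can then be used to filter a language model’s output.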
Web3
- Mastodon, a decentralized social network, appears to be benefitting from Elon Musk’s takeover of Twitter.
- Reputation and identity management for web3 is a significant problem: how do you verify identity and reputation without giving applications more information than they should have? A startup called Ontology claims to have solved it.
- A virtual art museum for NFTs is still under construction, but it exists, and you can visit it. It’s probably a better experience in VR.
- 2022 promises to be an even bigger year for cryptocrime than 2021. Attacks are increasingly focused on decentralized finance (DeFi) platforms.
- Could a web3 version of Wikipedia evade Russia’s demands that they remove “prohibited information”? Or will it lead to a Wikipedia that’s distorted by economic incentives (like past attempts to build a blockchain-based encyclopedia)?
- The Helium Network is a decentralized public wide area network using LoRaWAN that pays access point operators in cryptocurrency. The network has over 700,000 hotspots, and coverage in most of the world’s major metropolitan areas.
Programming
- Do we really need another shell scripting language? The developers of hush think we do. Hush is based on Lua, and claims to make shell scripting more robust and maintainable.
- WebAssembly is making inroads; here’s a list of startups using wasm for everything from client-side media editing to building serverless platforms, smart data pipelines, and other server-side infrastructure.
- QR codes are awful. Are they less awful when they’re animated? It doesn’t sound like it should work, but playing games with the error correction built into the standard allows the construction of animated QR codes (see the sketch after this list).
- Build your own quantum computer (in simulation)? The Qubit Game lets players “build” a quantum computer, starting with a single qubit.
- One of Docker’s founders is developing a new product, Dagger, that will help developers manage DevOps pipelines.
- Can applications use “ambient notifications” (like a breeze, a gentle tap, or a shift in shadows) rather than intrusive beeps and gongs? Google has published Little Signals, six experiments with ambient notifications that include code, electronics, and 3D models for the hardware.
- Lambda Function URLs automate the configuration of an API endpoint for single-function microservices on AWS. They make the process of mapping a URL to a serverless function simple; a minimal sketch follows this list.
- GitHub has added a dependency review feature that inspects the consequences of a pull request and warns of vulnerabilities that were introduced by new dependencies.
- Google has proposed Supply Chain Levels for Software Artifacts (SLSA) as a framework for ensuring the integrity of the software supply chain. It is a set of security guidelines that can be used to generate metadata; the metadata can be audited and tracked to ensure that software components have not been tampered with and have traceable provenance.
- Harvard and the Linux Foundation have produced Census II, which lists thousands of the most popular open source libraries and attempts to rank their usage.
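On the animated QR codes item: the trick is that level-H error correction tolerates roughly 30% damaged modules, and animation spends part of that damage budget. Here’s a rough sketch, assuming the `qrcode` and `Pillow` libraries. Real animated codes steer clear of the finder patterns; this toy scribbles anywhere, so not every frame is guaranteed to scan:

```python
import random

import qrcode  # pip install qrcode
from PIL import Image, ImageDraw

# Level-H error correction survives roughly 30% damaged modules;
# animated QR codes spend part of that damage budget on decoration.
qr = qrcode.QRCode(error_correction=qrcode.constants.ERROR_CORRECT_H)
qr.add_data("https://example.com")
qr.make(fit=True)
matrix = qr.get_matrix()  # booleans, quiet-zone border included
n, box = len(matrix), 10  # modules per side, pixels per module

frames = []
for t in range(8):  # 8 animation frames
    img = Image.new("RGB", (n * box, n * box), "white")
    draw = ImageDraw.Draw(img)
    for r, row in enumerate(matrix):
        for c, dark in enumerate(row):
            if dark:
                draw.rectangle(
                    [c * box, r * box, (c + 1) * box - 1, (r + 1) * box - 1],
                    fill="black",
                )
    # Overwrite ~5% of modules per frame, well under the ~30% budget,
    # so a scanner can usually still reconstruct the data.
    for _ in range(int(n * n * 0.05)):
        r, c = random.randrange(n), random.randrange(n)
        draw.rectangle(
            [c * box, r * box, (c + 1) * box - 1, (r + 1) * box - 1],
            fill=(255, 0, 0) if (t + r + c) % 2 else (0, 128, 255),
        )
    frames.append(img)

frames[0].save("animated_qr.gif", save_all=True, append_images=frames[1:],
               duration=120, loop=0)
```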
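And for the Lambda Function URLs item, a minimal sketch with boto3, assuming a function named `my-fn` already exists (the name is hypothetical); AWS generates and returns the HTTPS endpoint:

```python
import boto3

lam = boto3.client("lambda")

# Attach an HTTPS endpoint directly to an existing function; there are
# no API Gateway resources to configure. AWS_IAM limits callers to
# signed requests, while AuthType="NONE" would make the URL public.
resp = lam.create_function_url_config(
    FunctionName="my-fn",  # hypothetical function name
    AuthType="AWS_IAM",
)
print(resp["FunctionUrl"])  # https://<url-id>.lambda-url.<region>.on.aws/
```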
Security
- The REvil ransomware has returned (maybe). Although there’s a lot of speculation, it isn’t yet clear what this means or who is behind it. Nevertheless, they appear to be looking for business partners.
- Attackers used OAuth tokens stolen from third-party integrators to download data from a number of GitHub organizations, most notably npm.
- The NSA, Department of Energy, and other federal agencies have discovered a new malware toolkit named “pipedream” that is designed to disable power infrastructure. It’s adaptable to other critical infrastructure systems. It doesn’t appear to have been used yet.
- A Russian state-sponsored group known as Sandworm failed in an attempt to bring down Ukraine’s power grid. They used new versions of Industroyer (for attacking industrial control systems) and CaddyWiper (a data wiper, for covering their tracks after the attack).
- Re-use of IP addresses by cloud providers can lead to “cloud squatting”: an organization that is assigned a previously used IP address receives traffic intended for the address’s previous holder. Address assignment has become highly dynamic, and DNS wasn’t designed for that (a sketch of a defensive check follows this list).
- Pete Warden wants to build a coalition of researchers that will discuss ways of verifying the privacy of devices that have cameras and microphones (not limited to phones).
- Cyber warfare on the home front: The FBI remotely accessed devices at some US companies to remove Russian botnet malware. The malware targets WatchGuard firewalls and Asus routers. The Cyclops Blink botnet was developed by the Russia-sponsored Sandworm group.
- Ransomware attacks have been seen that target Jupyter Notebook servers where authentication has been disabled. There doesn’t appear to be a significant vulnerability in Jupyter itself; just don’t disable authentication (the configuration to avoid is shown after this list).
- By using a version of differential privacy on video feeds, surveillance cameras can provide a limited kind of privacy: users can ask aggregate questions about the footage, but can’t identify individuals (a minimal sketch of the mechanism follows this list). Whether anyone wants a surveillance camera with privacy features is another question.
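A defensive check for the cloud-squatting problem might look like the following sketch: compare what your DNS names resolve to against the addresses you currently hold. The record names here are hypothetical, and `describe_addresses` only covers EC2 Elastic IPs, so treat this as an illustration rather than a complete audit:

```python
import socket

import boto3

# Hypothetical DNS records we own; stale ones are what squatters inherit.
our_names = ["api.example.com", "files.example.com"]

ec2 = boto3.client("ec2")
held = {a["PublicIp"] for a in ec2.describe_addresses()["Addresses"]}

for name in our_names:
    ip = socket.gethostbyname(name)
    if ip not in held:
        print(f"stale record: {name} -> {ip} (address no longer ours)")
```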
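For the Jupyter item, the attack surface is a configuration choice, not a bug. This is (approximately) the foot-gun combination in `jupyter_notebook_config.py` that attackers scan for, shown as a negative example; the `c` object is supplied by Jupyter’s config loader:

```python
# jupyter_notebook_config.py: what NOT to do on a networked machine.
c.NotebookApp.token = ""      # disables token authentication
c.NotebookApp.password = ""   # and puts no password in its place
c.NotebookApp.ip = "0.0.0.0"  # listens on every interface
# Leave the token enabled (the default) and the server isn't an open
# door, whatever else is misconfigured.
```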
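The differential-privacy camera item is easier to grasp with the core mechanism in front of you. A minimal sketch, assuming the queries are aggregate counts (the actual system is surely more elaborate): the Laplace mechanism adds just enough noise that one person’s presence or absence is statistically deniable.

```python
import numpy as np

rng = np.random.default_rng()

def private_count(true_count: int, epsilon: float = 0.5) -> float:
    # One person entering or leaving the frame changes the count by at
    # most 1 (sensitivity 1), so Laplace noise with scale 1/epsilon
    # gives epsilon-differential privacy for that individual.
    return true_count + rng.laplace(scale=1.0 / epsilon)

# Hypothetical: a detector counted 4 people; the query interface
# releases only the noisy answer, never the raw pixels.
print(round(private_count(4), 1))
```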
Biology and Neuroscience
- A brain-computer interface has allowed an ALS patient who was completely “locked in” to communicate with the outside world. Communication is slow, but it goes well beyond simple yes/no requests.
Hardware
- CAT scans aren’t just for radiology. Lumafield has produced a table-sized CT-scan machine that can be used in small shops and offices, with the image analysis done in their cloud.
- Boston Dynamics has a second robot on the market: Stretch, a box-handling robot designed to perform tasks like unloading trucks and shipping containers.
- A startup claims it has the ability to put thousands of single-molecule biosensors on a silicon chip that can be mass-produced. They intend to have a commercial product by the end of 2022.