16 January 2020

Reformer: The Efficient Transformer




Understanding sequential data — such as language, music or videos — is a challenging task, especially when there is dependence on extensive surrounding context. For example, if a person or an object disappears from view in a video only to re-appear much later, many models will forget how it looked. In the language domain, long short-term memory (LSTM) neural networks cover enough context to translate sentence-by-sentence. In this case, the context window (i.e., the span of data taken into consideration in the translation) covers from dozens to about a hundred words. The more recent Transformer model not only improved performance in sentence-by-sentence translation, but could be used to generate entire Wikipedia articles through multi-document summarization. This is possible because the context window used by Transformer extends to thousands of words. With such a large context window, Transformer could be used for applications beyond text, including pixels or musical notes, enabling it to be used to generate music and images.

However, extending Transformer to even larger context windows runs into limitations. The power of Transformer comes from attention, the process by which it considers all possible pairs of words within the context window to understand the connections between them. So, in the case of a text of 100K words, this would require assessment of 100K x 100K word pairs, or 10 billion pairs for each step, which is impractical. Another problem is with the standard practice of storing the output of each model layer. For applications using large context windows, the memory requirement for storing the output of multiple model layers quickly becomes prohibitively large (from gigabytes with a few layers to terabytes in models with thousands of layers). This means that realistic Transformer models, using numerous layers, can only be used on a few paragraphs of text or generate short pieces of music.

Today, we introduce the Reformer, a Transformer model designed to handle context windows of up to 1 million words, all on a single accelerator and using only 16GB of memory. It combines two crucial techniques to solve the problems of attention and memory allocation that limit Transformer’s application to long context windows. Reformer uses locality-sensitive hashing (LSH) to reduce the complexity of attending over long sequences and reversible residual layers to use the available memory more efficiently.

The Attention Problem
The first challenge when applying a Transformer model to a very large text sequence is how to handle the attention layer. Reformer addresses this with locality-sensitive hashing (LSH), which computes a hash function that groups similar vectors together, instead of searching through all possible pairs of vectors. For example, in a translation task, where each vector from the first layer of the network represents a word (even larger contexts in subsequent layers), vectors corresponding to the same words in different languages may get the same hash. In the figure below, different colors depict different hashes, with similar words having the same color. When the hashes are assigned, the sequence is rearranged to bring elements with the same hash together and divided into segments (or chunks) to enable parallel processing. Attention is then applied within these much shorter chunks (and their adjoining neighbors to cover the overflow), greatly reducing the computational load.
Locality-sensitive hashing: Reformer takes in an input sequence of keys, where each key is a vector representing individual words (or pixels, in the case of images) in the first layer and larger contexts in subsequent layers. LSH is applied to the sequence, after which the keys are sorted by their hash and chunked. Attention is applied only within a single chunk and its immediate neighbors.
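To make the bucketing idea concrete, here is a minimal NumPy sketch: hash vectors with random projections, sort the sequence by bucket, and attend within fixed-size chunks plus the preceding chunk. All names here are illustrative, and the single hash round is a simplification; the actual Reformer uses multiple hash rounds, shared query/key projections, and other details omitted for brevity.

```python
import numpy as np

def lsh_hash(vectors, n_buckets, rng):
    """Angular LSH via random projections: similar vectors tend to
    land in the same bucket. n_buckets must be even."""
    proj = rng.normal(size=(vectors.shape[-1], n_buckets // 2))
    scores = vectors @ proj
    # argmax over [projections, -projections] gives a bucket id in [0, n_buckets)
    return np.argmax(np.concatenate([scores, -scores], axis=-1), axis=-1)

def chunked_lsh_attention(x, n_buckets=8, chunk_size=16, seed=0):
    """Toy single-round LSH attention over x of shape (seq_len, d_model).
    Keys double as queries, as in Reformer."""
    rng = np.random.default_rng(seed)
    seq_len, d_model = x.shape
    buckets = lsh_hash(x, n_buckets, rng)
    # Sort positions by bucket so similar items become neighbors.
    order = np.argsort(buckets, kind="stable")
    sx = x[order]
    out = np.zeros_like(sx)
    for start in range(0, seq_len, chunk_size):
        # Attend within the chunk plus the previous chunk (the "overflow").
        lo = max(0, start - chunk_size)
        ctx = sx[lo:start + chunk_size]
        q = sx[start:start + chunk_size]
        logits = q @ ctx.T / np.sqrt(d_model)
        weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[start:start + chunk_size] = weights @ ctx
    # Undo the sort to restore the original sequence order.
    restored = np.empty_like(out)
    restored[order] = out
    return restored
```

The key point is the cost: each position attends over at most two chunks rather than the whole sequence, so the quadratic term in sequence length disappears.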
The Memory Problem
While LSH solves the problem with attention, there is still a memory issue. A single layer of a network often requires up to a few GB of memory and usually fits on a single GPU, so even a model with long sequences could be executed if it only had one layer. But when training a multi-layer model with gradient descent, activations from each layer need to be saved for use in the backward pass. A typical Transformer model has a dozen or more layers, so memory quickly runs out if used to cache values from each of those layers.

The second novel approach implemented in Reformer is to recompute the input of each layer on-demand during back-propagation, rather than storing it in memory. This is accomplished by using reversible layers, where activations from the last layer of the network are used to recover activations from any intermediate layer, by what amounts to running the network in reverse. In a typical residual network, each layer in the stack keeps adding to vectors that pass through the network. Reversible layers, instead, have two sets of activations for each layer. One follows the standard procedure just described and is progressively updated from one layer to the next, but the other captures only the changes to the first. Thus, to run the network in reverse, one simply subtracts the activations applied at each layer.
Reversible layers: (A) In a standard residual network, the activations from each layer are used to update the inputs into the next layer. (B) In a reversible network, two sets of activations are maintained, only one of which is updated after each layer. (C) This approach enables running the network in reverse in order to recover all intermediate values.
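In code, the reversible residual trick amounts to two coupled residual updates that can be undone by subtraction. A minimal sketch, with stand-in functions for the attention and feed-forward sublayers (the real model applies this in every layer and recomputes activations during the backward pass):

```python
import numpy as np

def reversible_forward(x1, x2, f, g):
    """One reversible residual block (as in RevNets / Reformer):
    y1 = x1 + f(x2); y2 = x2 + g(y1)."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def reversible_inverse(y1, y2, f, g):
    """Recover the block's inputs from its outputs by subtraction,
    so intermediate activations never need to be stored."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

# Toy check with stand-in sublayers (a real model uses attention for f
# and a feed-forward network for g).
f = lambda v: np.tanh(v)
g = lambda v: 0.5 * v
x1, x2 = np.random.randn(4, 8), np.random.randn(4, 8)
y1, y2 = reversible_forward(x1, x2, f, g)
r1, r2 = reversible_inverse(y1, y2, f, g)
assert np.allclose(x1, r1) and np.allclose(x2, r2)
```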
Applications of Reformer
The novel application of these two approaches in Reformer makes it highly efficient, enabling it to process text sequences of lengths up to 1 million words on a single accelerator using only 16GB of memory. Since Reformer has such high efficiency, it can be applied directly to data with context windows much larger than those of virtually all current state-of-the-art text datasets. Perhaps Reformer’s ability to deal with such large datasets will stimulate the community to create them.

One area where there is no shortage of large-context data is image generation, so we experiment with the Reformer on images. In this colab, we present examples of how Reformer can be used to “complete” partial images. Starting with the image fragments shown in the top row of the figure below, Reformer can generate full frame images (bottom row), pixel-by-pixel.
Top: Image fragments used as input to Reformer. Bottom: “Completed” full-frame images. Original images are from the Imagenet64 dataset.
While the application of Reformer to imaging and video tasks shows great potential, its application to text is even more exciting. Reformer can process entire novels, all at once and on a single device. Processing the entirety of Crime and Punishment in a single training example is demonstrated in this colab. In the future, when there are more datasets with long-form text to train on, techniques such as the Reformer may make it possible to generate long coherent compositions.

Conclusion
We believe Reformer provides the basis for future use of Transformer models, both for long text and applications outside of natural language processing. Following our tradition of doing research in the open, we have already started exploring how to apply it to even longer sequences and how to improve the handling of positional encodings. Read the Reformer paper (selected for oral presentation at ICLR 2020), explore our code and develop your own ideas too. Few long-context datasets are widely used in deep learning yet, but in the real world long context is everywhere. Maybe you can find a new application for Reformer — start with this colab and chat with us if you have any problems or questions!

Acknowledgements
This research was conducted by Nikita Kitaev, Łukasz Kaiser and Anselm Levskaya. Additional thanks go to Afroz Mohiuddin, Jonni Kanerva and Piotr Kozakowski for their work on Trax and to the whole JAX team for their support.

44% of TikTok’s all-time downloads were in 2019, but app hasn’t figured out monetization


Despite the U.S. government’s concerns over TikTok, which most recently led to the U.S. Navy banning service members’ use of the app, TikTok had a stellar 2019 in terms of both downloads and revenue. According to new data from Sensor Tower, 44% of TikTok’s total 1.65 billion downloads to date, or 738+ million installs, took place in 2019 alone. And though TikTok is still just experimenting with different means of monetization, the app had its best year in terms of revenue, grossing $176.9 million in 2019 — or 71% of its all-time revenue of $247.6 million.

Apptopia had previously reported TikTok was generating $50 million per quarter.

The number of TikTok downloads in 2019 is up 13% from the 655 million installs the app saw in 2018, with the holiday quarter (Q4 2019) being TikTok’s best ever with 219 million downloads, up 6% from TikTok’s previous best quarter, Q4 2018. TikTok was also the second-most downloaded (non-game) app worldwide across the Apple App Store and Google Play in 2019, according to Sensor Tower data.

However, App Annie’s recent “State of Mobile” report put TikTok in fourth place, behind Messenger, Facebook, and WhatsApp — not in second place behind only WhatsApp, as Sensor Tower’s ranking has it.

Regardless, the increase in TikTok downloads in 2019 is largely tied to the app’s traction in India. Though the app was briefly banned in the country earlier in the year, that market still accounted for 44% (or 323M) of 2019’s total downloads. That’s a 27% increase from 2018.

TikTok’s home country, China, is TikTok’s biggest revenue driver, with iOS consumer spend of $122.9 million, or 69% of the total and more than triple what U.S. users spent in the app ($36M). The U.K. was the third-largest contributor in terms of revenue, with users spending $4.2 million in 2019.

These numbers, however, are minuscule in comparison with the billions upon billions earned by Facebook on an annual basis, or even the low single-digit billions earned by smaller social apps like Twitter. To be fair, TikTok remains in an experimental phase with regard to revenue. In 2019, it ran a variety of ad formats including brand takeovers, in-feed native video, hashtag challenges, and lens filters. It even dabbled in social commerce.

Meanwhile, only a handful of creators have been able to earn money in live streams through tipping — another area that deserves to see expansion in the months ahead, if TikTok aims to take on YouTube as a home for creator talent.

When it comes to monetization, TikTok is challenged because it doesn’t have as much personal information about its users, compared with a network like Facebook and its rich user profile data. That means advertisers can’t target ads based on user interests and demographics in the same way. Because of this, brands will sometimes forgo working with TikTok itself to deal directly with its influencer stars, instead.

What TikTok lacks in revenue, it makes up for in user engagement. According to App Annie, time spent in the app was up 210% year-over-year in 2019, reaching a total of 68 billion hours. TikTok clearly has users’ attention, but now it will need to figure out how to capitalize on those eyeballs and actually make money.

Reached for comment, TikTok confirmed it doesn’t share its own stats on installs or revenue, so third-party estimates are the only way to track the app’s growth for now.



Chrome gets global media controls


Here is a small but useful new feature in Google Chrome: global media controls that allow you to control all of the audio and video sources in your current tabs from a single widget. With this, you can switch to the next song from your favorite web-based music streaming service, start and stop a YouTube video that’s playing in the background, or switch back and forth between what’s playing in multiple tabs without having to hunt around your browser for the right tab. It’s not going to rock your world, but it’s a useful new feature.

Google started these media controls last year when it enabled them for Chromebook users, but the feature is now live in the stable channel for all Chrome users across desktop platforms.

This seems to work with as many media tabs as you can handle, though from what I have seen, Google’s own services like YouTube and YouTube Music tend to get more extensive control options with thumbnails, while Spotify only showed three controls: go back, skip to the next song and pause.

To give it a try, simply play media in any of your tabs and look for the new media control icon to pop up to the right of the URL field.

It’s worth noting that the new Chromium-based Microsoft Edge, which came out of preview yesterday, features the exact same media controls (down to the icon) in its pre-release channels, though they haven’t made it into the stable release yet. Firefox does not currently have a similar built-in feature.



Venture Highway announces $78.6M second fund to invest in early-stage startups in India


Venture Highway, a VC firm in India founded by former Google executive Samir Sood, said on Thursday it has raised $78.6 million for its second fund as it looks to double down on investing in early-stage startups.

The firm, founded in 2015, has invested in more than two dozen startups to date, including social network ShareChat, which last year raised $100 million in a financing round led by Twitter; social commerce startup Meesho, which has since grown to be backed by Facebook and Prosus Ventures; and Lightspeed-backed OkCredit, which provides a bookkeeping app for small merchants.

Moving forward, Venture Highway aims to lead pre-seed and seed financing rounds and cut checks of between $1 million and $1.5 million per investment (up from its earlier investment range of $100,000 to $1 million), said Sood in an interview with TechCrunch.

Venture Highway counts Neeraj Arora, former business head of WhatsApp who played an instrumental role in selling the messaging app to Facebook, as a founding “anchor of LPs” and advisor. Arora and Sood worked together at Google more than a decade ago and helped the Silicon Valley giant explore merger and acquisition deals in Asia and other regions.

Samir Sood, the founder of Venture Highway

The VC firm said it has already made a number of investments through its second fund. Some of those deals include investments in OkCredit, mobile esports platform MPL, Gurgaon-based supply chain SaaS platform O4S, social commerce startup WMall, online rental platform CityFurnish, community platform MyScoot and online gasoline delivery platform MyPetrolPump.

As apparent from the aforementioned names, Venture Highway focuses on investing in startups that are using technology to address problems that have not been previously tackled.

Last year Venture Highway also participated in a funding round of Marsplay, a New Delhi-based startup that operates a social app where influencers showcase beauty and apparel content to sell to consumers.

“It’s very rare to have investors who keep their calm, get into an entrepreneurial mindset and help founders achieve their dreams. Throughout the journey, Venture Highway has been extremely helpful, emotionally available (super important to founders) and very resourceful,” said Misbah Ashraf, 26-year-old co-founder and chief executive of Marsplay, in an interview with TechCrunch.

There is no “theme” or category that Venture Highway is particularly interested in, said Sood. “As long as there is a tech layer, and the startup is doing something where we or our network of LPs, advisors and investors can add value, we are open to discussions,” he said.

This is the first time Venture Highway has raised money from LPs. The firm’s first fund was bankrolled by Sood and Arora.

Dozens of local and international VC funds are today active in India, where startups raised a record $14.5 billion last year. But a significant number of them focus on late-stage deals.



Why are drug prices so high? Investigating the outdated US patent system | Priti Krishtel

Between 2006 and 2016, the number of drug patents granted in the United States doubled -- but not because there was an explosion in invention or innovation. Drug companies have learned how to game the system, accumulating patents not for new medicines but for small changes to existing ones, which allows them to build monopolies, block competition and drive prices up. Health justice lawyer Priti Krishtel sheds light on how we've lost sight of the patent system's original intent -- and offers five reforms for a redesign that would serve the public and save lives.


Wipro Ventures announces $150M Fund II to invest in enterprise startups


Wipro Ventures, the investment arm of one of India’s largest IT companies by market capitalization, said on Thursday it has raised $150 million for its second fund as it looks to invest in more enterprise startups and venture capitalist funds.

As with its $100 million maiden fund in 2015, Wipro Ventures will use its second fund to invest in early and mid-stage startups worldwide that are building enterprise solutions in cybersecurity, analytics, cloud infrastructure, test automation, and AI, said Biplab Adhya in an interview with TechCrunch.

Through its maiden fund, Wipro Ventures invested in 16 startups and five VC funds. Adhya said two of its portfolio startups — including Demisto, which sold to Palo Alto Networks for $560 million — have seen exits, while others are showing good signs.

“We are pleased with the traction these startups are showing and the value we have added to Wipro, and we look forward to continuing this journey,” he said.

Adhya said Wipro Ventures looks to be a long-term investor in a startup. In addition to often participating in a startup’s follow-on financing rounds, it tends to stay with a startup until its IPO, he said.

Of the 16 startups Wipro Ventures has invested in to date, 11 are based in the U.S., four in Israel, and one in India. Adhya said geography tends not to play a crucial role when investing in a startup, and he is open to ideas from anywhere in the world.

A corporate giant taking stakes in private firms is not a new phenomenon. Leaving aside American giants such as Google, Microsoft, and Facebook, all of which operate investment arms, Indian IT giants have also been at it for years.

HCL and Infosys, two other IT giants in India, have also invested in — or outright acquired — dozens of startups in recent years. A 2017 CB Insights report showed that Wipro and Infosys, which runs its own Innovation Fund, alone had invested in 28 firms and acquired eight startups.

Adhya said Wipro Ventures is now investing in six to eight startups each year.

One of the benefits of taking money from a corporate giant is sometimes getting access to their other customers. And that appears to be true of Wipro. More than 100 of Wipro’s global customers have deployed solutions from its portfolio startups, Adhya said.

In a statement, Rishi Bhargava, a founder of Demisto, explained the benefit. “Within the first year of our partnership, Wipro and Demisto were working together on dozens of Fortune 1000 opportunities and closing a majority of them.

“It’s exciting to see Wipro Ventures continue to enhance the startup ecosystem with new capital while helping companies boost their bottom line,” he added.



Learn How to Drive Traffic and Sales For Your Brand on Facebook With This $29 Training Bundle


Although there are now many pretenders to the social throne, Facebook is still the king. The platform is so powerful that many people run successful businesses through pages and groups, without a separate business website. The Ultimate Facebook Marketing Certification Bundle shows you how to gain traction with any brand, with 30 hours of hands-on training. You can get the bundle now for only $29 at MakeUseOf Deals.

Social Media Marketing

What makes Facebook so useful for marketers is that you can target specific audiences. Say you want to offer business coaching to people in Oakland — with Facebook, it is possible to target young professionals interested in entrepreneurship who live in the city.

This training bundle shows you how, with seven in-depth courses looking at various forms of Facebook marketing. Through video lessons, you discover how to build your own page, create a sales funnel and set up retargeting. You also learn how to use Facebook Ads and livestreaming to expand your audience.

Each course offers simple, actionable advice that is broken down into concise tutorials. You should come away with the skills to build a brand or even offer your services as a Facebook marketing expert.

Seven Courses for $29

These courses are worth $1,400 in total, but you can get the bundle now for only $29 with lifetime access included.




Gateron’s Mechanical Switches for 2020: Waterproof, Magnetic, and Low-Profile


Gateron 2020 switches displayed on a table

Gateron, a top-tier manufacturer of mechanical switches for keyboards, announced their newest and greatest products for 2020: a range of colored switches which include magnetic, low-profile, waterproof, tactile, linear, and clicky options. Most are simply upgrades of their 2019 models, but a handful are entirely new technologies.

Gateron provided me with a grab-bag of their old and new switch options. This very brief wrap-up will cover Gateron’s new switch options and how they compare to existing options from their competitors at Cherry.

The 2020 Gateron Lineup

The current Gateron lineup includes magnetic, waterproof, and low-profile switches, which require special PCBs in order to work. In other words, magnetic, waterproof, and low-profile switch technologies are not compatible with the Cherry-MX standard which dominates today’s keyboard market. However, the magnetic and waterproof switches will work with Cherry-MX key-cap stems.

Gateron Low-Profile KS-12 Switches

Gateron low-profile Red switch

Gateron’s newest switches include three special low-profile switches which come in Blue, Brown, and Red. As with Cherry’s color-nomenclature, they are as follows:

  • Brown: tactile, 2.5mm travel, 1.5mm actuation, 50g force
  • Blue: clicky, 2.5mm travel, 1.5mm actuation, 52g force
  • Red: linear, 2.5mm travel, 1.5mm actuation, 45g force

The new switches will compete against Kailh’s Chocolate line of low-profile switches and Cherry’s similarly slender competitors. To be honest, I cannot tell the difference between Kailh’s and Gateron’s options. While Gateron’s Cherry clones generally feel smoother and less scratchy than products from Cherry and Kailh, its low-profile options seem virtually identical to its competitors’. But that’s not bad. The low-profile keys are rattle-free designs, as they incorporate box housings for their key caps.

Gateron low-profile Red switch, side view

Gateron Magnetic KS-20 Switches

Gateron magnetic switch displayed from the bottom

Gateron’s magnetic switches (model number KS-20) bring the deep 4mm of travel expected of desktop-class mechanical switches along with an extremely lightweight actuation force of 30 grams. The magnetic switches also have less variation in their actuation force (+/- 10 grams instead of 15 grams).

Their key selling point is their massively enhanced durability versus standard mechanical switches: they last twice as long as regular Cherry switches. Instead of 50 million presses, the KS-20 is rated for 100 million. However, I doubt anyone in history has reached the mechanical durability limit of 50 million key actuations, so 100 million is superfluous.

Gateron magnetic switch displayed from the side

Unfortunately, the KS-20 will require its own custom PCB. So you cannot retrofit older boards with the newer switch technology.

Gateron KS-12 Waterproof Switches

Unfortunately, Gateron did not show me their waterproof switch, which also requires its own specialized PCB. The switch contains four plastic stabilizers on its base, which means you cannot simply stick one on a random Cherry-MX-compatible PCB. So if you wanted to spray a waterproof coating on your favorite PCB and throw some KS-12 switches on it, you’re out of luck.

Gateron 2019 Vs. Gateron 2020

Gateron’s first line of keyboard switches consisted of more or less faithful copies of Cherry’s color scheme, with Gateron’s own unique take. These models (KS-3, KS-8, and KS-9) included the following colors:

  • Blue: midweight, 55g actuation, clicky
  • Brown: lightweight, 45g actuation, tactile
  • Black: midweight, 50g actuation, linear
  • Red: lightweight, 45g actuation, linear
  • Green: heavy, 80g actuation, clicky
  • Clear: lightweight, 35g actuation, linear

In 2019, Gateron offered an infrared switch, known as the KS-15. It eventually added its Ink series, which is more or less a set of colored, clear-bodied switch housings with slightly deeper actuation points. Overall, its 2020 lineup represents a more radical departure from its competitors and its 2019 offerings.




You Can Now Make Spotify Playlists for Pets


Spotify wants to help you make a music playlist for your pet. Which is every bit as silly as it sounds. However, with most of us owning at least one pet, we’re going to go ahead and guess that most of you reading this are intrigued enough to give it a go.


Spotify Wants to Make Playlists for Your Pets

According to Spotify, 71 percent of pet owners have played music for their pets, and 80 percent believe their pets like music. This is based on a survey of 5,000 music-streaming pet owners from the US, the UK, Australia, Spain, and Italy.

Upon discovering how mad most pet owners are, Spotify set about creating playlists for pets. So, as outlined on For the Record, Spotify “created a unique experience to help you craft the pawfect algorithmically generated playlist for you and your pet to enjoy together.”

How to Make a Spotify Playlist for Your Pet

  1. Open the Spotify for Pets microsite and sign into your Spotify account.
  2. Pick your pet (only dogs, cats, hamsters, birds, and iguanas are available).
  3. Use the slider to indicate your pet’s personality (from Relaxed to Energetic).
  4. Use the slider to indicate your pet’s personality (from Shy to Friendly).
  5. Use the slider to indicate your pet’s personality (from Apathetic to Curious).
  6. Type in the name of your pet and upload a photo of your pet (optional).
  7. Spotify will sync your pet’s personality with your tastes to create a playlist.

You’ll then be presented with a card revealing a selection of the artists included on the playlist. And you can either click “Listen Now” to open Spotify, or “Share what you just made for your best friend” with others via Facebook, Twitter, and Instagram.

Spotify’s playlists for pets are “algorithmically created” and “based on your listening habits and your pet’s attributes”. Spotify admits that “music for pets isn’t an exact science,” but assures us that they “consulted with experts in the pet industry”.

If you’re wondering why your goldfish, tarantula, or snake isn’t catered for, according to Spotify, it’s because they don’t have ears. Obviously. As for rabbits, aardvarks, and horses, Spotify has no excuse, but suggests trying a playlist meant for another animal.

How Did Your Pet Respond to Their Spotify Playlist?

Music playlists for pets are just a bit of fun, and not to be taken too seriously. Still, it would be interesting to hear how your pet reacted to Spotify’s choice of music for them. Unfortunately, virtual pets that live on your mobile don’t count. Yet.




Apple buys edge-based AI startup Xnor.ai for a reported $200M


Xnor.ai, spun off in 2017 from the nonprofit Allen Institute for AI (AI2), has been acquired by Apple for about $200 million. A source close to the company corroborated a report this morning from GeekWire to that effect.

Apple confirmed the reports with its standard statement for this sort of quiet acquisition: “Apple buys smaller technology companies from time to time and we generally do not discuss our purpose or plans.” (I’ve asked for clarification just in case.)

Xnor.ai began as a process for making machine learning algorithms highly efficient — so efficient that they could run on even the lowest tier of hardware out there, things like embedded electronics in security cameras that use only a modicum of power. Yet using Xnor’s algorithms they could accomplish tasks like object recognition, which in other circumstances might require a powerful processor or connection to the cloud.

CEO Ali Farhadi and his founding team put the company together at AI2 and spun it out just before the organization formally launched its incubator program. It raised $2.7M in early 2017 and $12M in 2018, both rounds led by Seattle’s Madrona Venture Group, and has steadily grown its local operations and areas of business.

The $200M acquisition price is only approximate, the source indicated, but even if the final number were less by half that would be a big return for Madrona and other investors.

The company will likely move to Apple’s Seattle offices; GeekWire, visiting the Xnor.ai offices (in inclement weather, no less), reported that a move was clearly underway. AI2 confirmed that Farhadi is no longer working there, but he will retain his faculty position at the University of Washington.

An acquisition by Apple makes perfect sense when one thinks of how that company has been directing its efforts towards edge computing. With a chip dedicated to executing machine learning workflows in a variety of situations, Apple clearly intends for its devices to operate independent of the cloud for such tasks as facial recognition, natural language processing, and augmented reality. It’s as much for performance as privacy purposes.

Its camera software especially makes extensive use of machine learning algorithms for both capturing and processing images, a compute-heavy task that could potentially be made much lighter with the inclusion of Xnor’s economizing techniques. The future of photography is code, after all — so the more of it you can execute, and the less time and power it takes to do so, the better.

It could also indicate new forays into the smart home, toward which Apple has made some tentative steps with HomePod. But Xnor’s technology is highly adaptable and, as such, rather difficult to predict as far as what it enables for such a vast company as Apple.




Can You Trust Your Model’s Uncertainty?




In an ideal world, machine learning (ML) methods like deep learning are deployed to make predictions on data from the same distribution as that on which they were trained. But the practical reality can be quite different: camera lenses becoming blurry, sensors degrading, and changes to popular online topics can all result in differences between the distribution of data on which the model was trained and that to which the model is applied, leading to what is known as covariate shift. For example, it was recently observed that deep learning models trained to detect pneumonia in chest x-rays would achieve very different levels of accuracy when evaluated on previously unseen hospitals’ data, due in part to subtle differences in image acquisition and processing.

In “Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift,” presented at NeurIPS 2019, we benchmark the uncertainty of state-of-the-art deep learning models as they are exposed to both shifting data distributions and out-of-distribution data. In this work we consider a variety of input modalities, including images, text and online advertising data, exposing these deep learning models to increasingly shifted test data while carefully analyzing the behavior of their predictive probabilities. We also compare a variety of different methods for improving model uncertainty to see which strategies perform best under distribution shift.

What is Out-of-Distribution Data?
Deep learning models provide a probability with each prediction, representing the model confidence or uncertainty. As such, they can express what they don’t know and, correspondingly, abstain from prediction when the data is outside the realm of the original training dataset. In the case of covariate shift, uncertainty would ideally increase proportionally to any decrease in accuracy. A more extreme case is when data are not at all represented in the training set, i.e., when the data are out-of-distribution (OOD). For example, consider what happens when a cat-versus-dog image classifier is shown an image of an airplane. Would the model confidently predict incorrectly or would it assign a low probability to each class? In a related post we recently discussed methods we developed to identify such OOD examples. In this work we instead analyze the predictive uncertainty of models given out-of-distribution and shifted examples to see if the model probabilities reflect their ability to predict on such data.

Quantifying the Quality of Uncertainty
What does it mean for one model to have better representation of its uncertainty than another? While this can be a nuanced question that often is defined by a downstream task, there are ways to quantitatively assess the general quality of probabilistic predictions. For example, the meteorological community has carefully considered this question and developed a set of proper scoring rules that a comparison function for probabilistic weather forecasts should satisfy in order to be well-calibrated, while still rewarding accuracy. We applied several of these proper scoring rules, such as the Brier Score and Negative Log Likelihood (NLL), along with more intuitive heuristics, such as the expected calibration error (ECE), to understand how different ML models dealt with uncertainty under dataset shift.
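For readers who want to compute these metrics themselves, here is a rough NumPy sketch of the Brier score, NLL, and ECE as they are conventionally defined; the 15-bin ECE and the function names are common conventions rather than anything specific to the paper.

```python
import numpy as np

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and
    one-hot labels (a proper scoring rule)."""
    onehot = np.eye(probs.shape[1])[labels]
    return np.mean(np.sum((probs - onehot) ** 2, axis=1))

def negative_log_likelihood(probs, labels):
    """Average negative log probability assigned to the true class."""
    eps = 1e-12  # guard against log(0)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + eps))

def expected_calibration_error(probs, labels, n_bins=15):
    """Gap between confidence and accuracy, averaged over equal-width
    confidence bins and weighted by bin size."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```

Each function takes an (N, classes) array of predicted probabilities and an (N,) array of integer labels; lower is better for all three.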

Experiments
We analyze the effect of dataset shift on uncertainty across a variety of data modalities, including images, text, online advertising data and genomics. As an example, we illustrate the effect of dataset shift on the ImageNet dataset, a popular image understanding benchmark. ImageNet involves classifying over a million images into 1000 different categories. Some now consider this challenge mostly solved, and have developed harder variants, such as Corrupted ImageNet (or ImageNet-C), in which the data are augmented according to 16 different realistic corruptions, each at 5 different intensities.
We explore how model uncertainty behaves under changes to the data distribution, such as increasing intensities of the image perturbations used in Corrupted Imagenet. Shown here are examples of each type of image corruption, at intensity level 3 (of 5).
We used these corrupted images as examples of shifted data and examined the predictive probabilities of deep learning models as they were exposed to shifts of increasing intensity. Below we show box plots of the resulting accuracy and the ECE for each level of corruption (including uncorrupted test data), where each box aggregates across all corruption types in ImageNet-C. Each color represents a different type of model — a “vanilla” deep neural network used as a baseline, four uncertainty methods (dropout, temperature scaling and our last layer approaches), and an ensemble approach.
Accuracy (top) and expected calibration error (bottom; lower is better) for increasing intensities of dataset shift on ImageNet-C. We observe that the decrease in accuracy is not reflected by an increase in uncertainty of the model, indicated by both accuracy and ECE getting worse.
As the shift intensity increases, the deviation in accuracy across corruption methods for each model increases (increasing box size), as expected, and the accuracy on the whole decreases. Ideally this would be reflected in increasing uncertainty of the model, thus leaving the expected calibration error (ECE) unchanged. However, looking at the lower plot of the ECE, one sees that this is not the case and that calibration generally suffers as well. We observed similar worsening trends for Brier score and NLL, indicating that the models are not becoming increasingly unsure with shift, but instead are becoming confidently wrong.

One popular method to improve calibration is known as temperature scaling, a variant of Platt scaling, which involves smoothing the predictions after training, using performance on a held-out validation set. We observed that while this improved calibration on the standard test data, it often made things worse on shifted data! Thus, practitioners applying this technique should be wary of distributional shift.
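For reference, temperature scaling is a one-parameter post-hoc fix: divide the logits by a scalar T fitted on held-out validation data, which leaves the predicted class unchanged while softening (or sharpening) the probabilities. A minimal sketch, using a grid search in place of the usual gradient-based fit:

```python
import numpy as np

def softmax(logits, T=1.0):
    z = logits / T
    z -= z.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(val_logits, val_labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature that minimizes NLL on a held-out validation
    set; at test time, predict with softmax(test_logits, T)."""
    def nll(T):
        p = softmax(val_logits, T)
        return -np.mean(np.log(p[np.arange(len(val_labels)), val_labels] + 1e-12))
    return min(grid, key=nll)
```

Because T is fitted on in-distribution validation data, there is no guarantee it transfers to shifted data, which is consistent with the degradation we observed.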

Fortunately, one method degrades in uncertainty much more gracefully than the others. Deep ensembles (green), which average the predictions of several models, each trained from a different random initialization, are a simple strategy that significantly improves robustness to shift and outperformed all other methods tested.
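A deep ensemble needs no special machinery, as the sketch below suggests: train the same architecture several times from different random initializations and average the predicted probabilities. The `predict_fns` interface is purely illustrative.

```python
import numpy as np

def ensemble_predict(predict_fns, x):
    """Average class probabilities from M independently trained models
    (different random initializations, same architecture). `predict_fns`
    is a list of functions mapping inputs to (N, classes) probability
    arrays; the spread across members doubles as an uncertainty signal."""
    probs = np.stack([f(x) for f in predict_fns])  # (M, N, classes)
    mean = probs.mean(axis=0)                      # ensemble prediction
    disagreement = probs.std(axis=0)               # crude uncertainty signal
    return mean, disagreement
```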

Summary and Recommended Best Practices
In our paper, we explored the behavior of state-of-the-art models under dataset shift across images, text, online advertising data and genomics. Our findings were mostly consistent across these different kinds of data. The quality of uncertainty degrades under dataset shift, but there are promising avenues of research to mitigate this. We hope that deep learning users take home the following messages from our study:
  1. Uncertainty under dataset shift is a real concern that needs to be considered when training models.
  2. Improving calibration and accuracy on an in-distribution test set often does not translate to improved calibration on shifted data.
  3. Out of all the methods we considered, deep ensembles are the most robust to dataset shift, and a relatively small ensemble size (e.g., 5) is sufficient. The effectiveness of ensembles presents interesting avenues for improving other approaches.
Improving the predictive uncertainty of deep learning models remains an active area of research in ML. We have released all of the code and model predictions from this benchmark in the hope that it will be useful to the community to drive and evaluate future work on this important topic.

Formlabs CEO on the state of 3D printing and its remaining challenges


3D printing isn’t the buzzy, hype-tastic topic it was just a few years ago — at least not with consumers. 3D printing news out of CES last week seemed considerably quieter than years prior; the physical booths for many 3D printing companies I saw took up fractions of the footprints they did just last year. Tapered, it seems, are the dreams of a 3D printer in every home.

In professional production environments, however, 3D printing remains a crucial tool. Companies big and small tap 3D printing to design and test new concepts, creating one-off prototypes in-house at a fraction of the cost and time compared to going back-and-forth with a factory. Sneaker companies are using it to create new types of shoe soles from experimental materials. Dentists are using it to create things like dentures and bridges in-office, in hours rather than days.

One of the companies that has long focused on pushing 3D printing into production is Formlabs, the Massachusetts-based team behind the aptly named Form series of pro-grade desktop 3D printers. The company launched its first product in 2012 after raising nearly $3 million on Kickstarter; by 2018, it was raising millions at a valuation of over a billion dollars.



Google Cloud gets a premium support plan with 15-minute response times


Google Cloud today announced the launch of its premium support plans for enterprise and mission-critical needs. This new plan brings Google’s support offerings for the Google Cloud Platform (GCP) in line with its premium G Suite support options.

“Premium Support has been designed to better meet the needs of our customers running modern cloud technology,” writes Google’s VP of Cloud Support, Atul Nanda. “And we’ve made investments to improve the customer experience, with an updated support model that is proactive, unified, centered around the customer, and flexible to meet the differing needs of their businesses.”

The premium plan, which Google will charge for based on your monthly GCP spend (with a minimum cost of what looks to be about $12,500 per month), promises a 15-minute response time for P1 cases. Those are situations when an application or infrastructure is unusable in production. Other features include training and new product reviews, as well as support for troubleshooting third-party systems.

Google stresses that the team that will answer a company’s calls will consist of “content-aware experts” who know your application stack and architecture. Like with similar premium plans from other vendors, enterprises will have a Technical Account Manager who works through these issues with them. Companies with global operations can opt to have (and pay for) technical account managers available during business hours in multiple regions.

The idea here, however, is also to give GCP users more proactive support, which will soon include a site reliability engineering engagement, for example, that is meant to help customers “design a wrapper of supportability around the Google Cloud customer projects that have the highest sensitivity to downtime.” The Support team will also work with customers to get them ready for special events like Black Friday or other peak events in their industry. Over time, the company plans to add more features and additional support plans.

As with virtually all of Google’s recent cloud moves, today’s announcement is part of the company’s efforts to get more enterprises to move to its cloud. Earlier this week, for example, it launched support for IBM’s Power Systems architecture, as well as new infrastructure solutions for retailers. In addition, it also acquired no-code service AppSheet.



Google finally brings its security key feature to iPhones


More than half a year after Google said Android phones could be used as a security key, the feature is coming to iPhones.

Google said it’ll bring the feature to iPhones in an effort to give at-risk users, like journalists and politicians, access to additional account and security safeguards, effectively removing the need to use a physical security key like a Yubico YubiKey or a Google Titan key.

Two-factor authentication remains one of the best ways to protect online accounts. Typically it works by getting a code or a notification sent to your phone. By acting as an additional layer of security, it makes it far more difficult for even the most sophisticated and resource-backed attackers to break in. Hardware keys are even stronger. Google’s own data shows that security keys are the gold standard for two-factor authentication, surpassing other options, like a text message sent to your phone.

Google said it was bringing the technology to iPhones as part of an effort to give at-risk groups greater access to tools that secure their accounts, particularly in the run-up to the 2020 presidential election, where foreign interference remains a concern.

