Clash of Frameworks

Susant Achary · Published in DataDrivenInvestor
6 min read · Nov 23, 2019


TensorFlow vs. PyTorch


Most deep learning applications run on TensorFlow or PyTorch. A new analysis found that they have very different audiences.

Source: thegradient.pub

In 2019, the war between ML frameworks has two remaining main contenders: PyTorch and TensorFlow. The analysis suggests that researchers are abandoning TensorFlow and flocking to PyTorch in droves. Meanwhile, in industry, TensorFlow is currently the platform of choice, but that may not be true for long.

PyTorch’s increasing dominance in research

The graph above shows the ratio of PyTorch papers to papers that use either TensorFlow or PyTorch at each of the top research conferences over time. All the lines slope upward, and in 2019 every major conference has had a majority of its papers implemented in PyTorch.

CVPR, ICCV, ECCV — computer vision conferences
NAACL, ACL, EMNLP — NLP conferences
ICML, ICLR, NeurIPS — general ML conferences

Why do researchers love PyTorch?

  • Simplicity. It’s similar to numpy, very pythonic, and integrates easily with the rest of the Python ecosystem. For example, you can drop a pdb breakpoint anywhere in your PyTorch model and it’ll just work. In TensorFlow, debugging the model requires an active session and ends up being much trickier.
  • Great API. Most researchers prefer PyTorch’s API to TensorFlow’s API. This is partially because PyTorch is better designed and partially because TensorFlow has handicapped itself by switching APIs so many times (e.g. ‘layers’ -> ‘slim’ -> ‘estimators’ -> ‘tf.keras’).
  • Performance. Despite the fact that PyTorch’s dynamic graphs give strictly less opportunity for optimization, there have been many anecdotal reports that PyTorch is as fast if not faster than TensorFlow. It’s not clear if this is really true, but at the very least, TensorFlow hasn’t gained a decisive advantage in this area.
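The simplicity point above is easy to see in a tiny sketch. The `TinyNet` model below is an invented example; because PyTorch executes eagerly, every intermediate tensor is an ordinary Python object you can inspect, and a debugger can be dropped straight into `forward`:

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """A minimal model illustrating PyTorch's eager, numpy-like style."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        h = self.fc(x)
        # Execution is eager, so a breakpoint works right here:
        # import pdb; pdb.set_trace()
        # ...and h can be inspected like any object (h.shape, h.mean(), ...).
        return torch.relu(h)

model = TinyNet()
out = model(torch.randn(3, 4))
print(out.shape)  # torch.Size([3, 2])
```

In graph-mode TensorFlow 1.x, by contrast, `h` would be a symbolic node with no value until a session runs it, which is exactly why debugging there is trickier.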

Findings:

Horace He used proxy data to determine whether users were from the research or business community.

  • To represent the research community, he surveyed abstracts submitted to five top AI conferences in 2018. He found an average increase of 275 percent in researchers using PyTorch, and an average decrease of roughly 0.5 percent for TensorFlow, over the year.
  • To track business users, he analyzed 3,000 job listings. Businesses looking for experience in TensorFlow outnumbered those asking for experience in PyTorch. He also surveyed articles on LinkedIn and found a ratio of 3,230 to 1,200 in favor of TensorFlow.
  • TensorFlow also outnumbered PyTorch in GitHub stars, which coders use to bookmark repositories for later use. He considers this a key metric for tracking projects in production.

Competitive strengths:

  • TensorFlow has a large, well-established user base, and industry is typically slower to pick up on new technologies.
  • TensorFlow is much more efficient than PyTorch. Even modest savings in model run times can help a company’s bottom line.
  • PyTorch integrates neatly with Python, making the code simple to use and easy to debug.
  • According to He, many researchers prefer PyTorch’s API, which has remained consistent since the framework’s initial release in 2016.

Framework “Convergence”

PyTorch TorchScript

The PyTorch JIT compiles PyTorch models to an intermediate representation (IR) called TorchScript. TorchScript is the “graph” representation of PyTorch. You can turn a regular PyTorch model into TorchScript using either tracing or script mode. Tracing takes a function and an input, records the operations that were executed with that input, and constructs the IR from them. Although straightforward, tracing has its downsides.
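One such downside can be sketched with a toy function (the function `f` below is an invented example): tracing records only the operations from the branch taken for the example input, so data-dependent control flow is silently baked in, whereas `torch.jit.script` compiles the Python source and preserves the `if`:

```python
import torch

def f(x):
    # Data-dependent control flow: which branch runs depends on x.
    if x.sum() > 0:
        return x * 2
    return x + 1

# Tracing runs f once with this input and records only the ops executed,
# so the `x * 2` branch is frozen into the graph (PyTorch warns about this).
traced = torch.jit.trace(f, torch.ones(2))

# Scripting compiles the source itself, keeping the control flow intact.
scripted = torch.jit.script(f)

neg = -torch.ones(2)
print(traced(neg))    # tensor([-2., -2.])  <- wrong branch baked in
print(scripted(neg))  # tensor([0., 0.])    <- correct branch taken
```

This is why script mode exists alongside tracing: tracing is simpler, but only safe for models whose structure does not depend on the input values.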


Tensorflow Eager

At the API level, TensorFlow’s eager mode is essentially identical to PyTorch’s eager mode, which was originally popularized by Chainer. This gives TensorFlow most of the advantages of PyTorch’s eager mode (ease of use, debuggability, etc.).

However, this also gives TensorFlow the same disadvantages. TensorFlow eager models can’t be exported to a non-Python environment, they can’t be optimized, they can’t run on mobile, etc.

This puts TensorFlow in the same position as PyTorch, and they resolve it in essentially the same way — you can trace your code (tf.function) or reinterpret the Python code (Autograph).


Current State of ML Frameworks

And thus we arrive at the current state of ML frameworks. PyTorch has the research market and is trying to extend this success to the industry. TensorFlow is trying to stem its losses in the research community without sacrificing too much of its production capabilities. It will certainly take a long time before PyTorch can make a meaningful impact in the industry — TensorFlow is too entrenched and industry moves slowly. However, the transition from TensorFlow 1.0 to 2.0 will be difficult and provides a natural point for companies to evaluate PyTorch.

The future will come down to who can best answer the following questions.

  • How much will researcher preference affect industry? As the current crop of PhDs starts to graduate, they’ll bring PyTorch with them. Is this preference strong enough that companies will choose PyTorch for hiring purposes? Will graduates found startups built on top of PyTorch?
  • Can TensorFlow’s eager mode catch up to PyTorch in usability? My impression from issue trackers and online communities is that TensorFlow Eager suffers heavily from performance/memory issues and that Autograph has its own share of issues. Google will be spending a large amount of engineering effort, but TensorFlow is encumbered with historical baggage.
  • How fast can PyTorch get to a production-ready state? There are still many fundamental issues that PyTorch hasn’t addressed: no good quantization story, no mobile support, no serving story, and so on. Until these are resolved, PyTorch won’t even be an option for many companies. Can PyTorch offer a compelling enough story for companies to make the switch? Note: the day this article was released, PyTorch announced support for both quantization and mobile. Both are still experimental, but they represent significant progress on this front for PyTorch.
  • Will Google’s isolation in industry hurt it? One of the primary reasons Google pushes for TensorFlow is to help its burgeoning cloud service. Since Google is trying to own the entire ML vertical, this incentivizes the companies Google is competing with (Microsoft, Amazon, Nvidia) to support the only alternative machine learning framework.

What Next?

In the midst of all these conflicting interests, and all the money thrown around machine learning, it’s nice to take a step back. Most of us don’t work on machine learning software for the money or to assist in our company’s strategic plans. We work in machine learning because we care — about advancing machine learning research, about democratizing AI, or maybe just about building cool stuff. Whether you prefer TensorFlow or PyTorch, we’re all just trying to make machine learning software the best it can be.

Do read the complete source article at thegradient.pub.

Keep Reading (_/\_).

