The 8th Annual Conference

Big Data Finance 2020

THE BIGGEST 100% VIRTUAL EVENT

Thursday, JUNE 4, 2020

AI - Blockchain - Connectivity - Data - Executive Perspective

Cornell MFE

SELECTED PRESENTATIONS

Using Neural Network in a trading strategy: A Practical Case

Alon Horesh
Co-founder at AlphaOverBeta
Abstract:

- The AI component is a building block in the entire trading process
- Training samples: sometimes less is more
- Live results must confirm the simulation (another big data challenge)
- The prediction success rate of the regressor is less relevant than the overall strategy performance



A Revolutionary Futures Exchange and Clearing House Trading the World's Major Assets in a Creative New Way

Robert Krause
Chairman and CEO at Demand Derivatives
Abstract:

1. Revamped exchange process
2. Reimagined clearing house (including blockchain clearing)
3. Four novel instrument designs
4. Major benefits and large cost reductions
5. Funding/Investment



Backtesting in Digital Asset Markets: Analysis of Order Book vs. Candlestick Backtests

Austin Hubbell
CEO and co-founder at Consilium Crypto
Abstract:

- Overview of global digital asset market structure
- Exchange transaction data issues & validity
- Empirical backtest results using Candlestick Data
- Empirical backtest results using Order Book Data
- Comparison, analysis and best practices for backtesting in digital asset markets



Fund2Vec: Identifying similar funds and ETFs using network theory and machine learning

Dhagash Mehta
Senior Investment Strategies Manager at Vanguard
Abstract:

Finding products similar to a given product is a ubiquitous problem arising in most businesses. Identifying a fund or ETF similar to a competitor's product has widespread applications, ranging from proactively marketing and selling home-grown products, to comparing the performance of a home-grown fund with its peers, to tax-loss harvesting. The traditional methods are known not to capture all the nuances among the products from the raw data. We propose a radically new approach that first represents funds and their underlying assets as a network. Then, we use a sophisticated machine learning method called Node2Vec to learn a network representation. Finally, we use this network embedding to identify similar products by computing node similarities in the representation space. Our preliminary results provide novel insights into the fund similarity problem as well as a data-driven approach to categorizing a fund universe.
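
The last step of the abstract, computing node similarities in the representation space, can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure and uses made-up three-dimensional vectors in place of learned Node2Vec embeddings; the fund names, the vectors, and the helper functions are all hypothetical and are not taken from the talk.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(target, embeddings, top_k=2):
    """Rank every other fund by similarity to the target fund."""
    scores = [
        (name, cosine_similarity(embeddings[target], vec))
        for name, vec in embeddings.items()
        if name != target
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_k]

# Hypothetical embeddings; a real pipeline would learn these from the
# fund/asset network rather than hard-code them.
EMBEDDINGS = {
    "FUND_A": [0.9, 0.1, 0.0],
    "FUND_B": [0.8, 0.2, 0.1],
    "FUND_C": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for name, score in most_similar("FUND_A", EMBEDDINGS):
        print(f"{name}: {score:.3f}")
```

Cosine similarity is only one possible choice; any distance in the embedding space could be substituted without changing the overall structure.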



Using AI and ML to Extract ESG and Intangible Risk Factors from Text

Stephen Malinak
Chief Data and Analytics Officer at Truvalue Labs
Abstract:

Summary bullet points
- Many factors drive stock returns, including top-down macro, bottom-up fundamentals, fund flows, and idiosyncratic events
- Using AI, ML, and massive computational horsepower on pure price and volume time series often fails badly in out-of-sample trading
- AI and ML tools are well-suited for mimicking how human analysts process unstructured data such as news, blog posts, research reports, and other text
- Environmental, Social, and Governance (ESG) and other intangible risk data can be gleaned from third-party text sources for both short- and long-term trading signals
- The resulting signals provide abnormal returns on par with some of the best-known quant factors, and with lower factor risk



Managing Crypto and Commodities Portfolios with Data Science

Irene Aldridge
President at AbleMarkets
Abstract:

A lot of portfolio management theory is focused on equities. While Markowitz principles for portfolio diversification should work across financial instruments, the results are not always encouraging. We show that clustering, a data science technique, helps portfolio managers successfully diversify their long-only crypto and commodities portfolios only by tweaking their allocations along clustering frontiers.



NLP for analysis of corporate statements

Andrew Chin
Chief Risk Officer and Head of Quantitative Research at AllianceBernstein
Abstract:

We analyze transcripts of earnings calls over the last 15+ years using different sentiment techniques to compare their efficacy for stock selection. We have several findings. First, sentiment-based techniques can differentiate between outperformers and underperformers over different time horizons. We test this by systematically scoring sentiment across earnings transcripts every month, ranking companies by their scores, and then calculating the returns of these companies over time. In addition, we find that these techniques work on US, Global ex-US and Emerging Market companies. Finally, we assess recent NLP innovations like BERT and conduct direct comparisons against dictionary-based techniques for predicting outperformance.
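
The scoring-and-ranking test described above can be sketched for the dictionary-based case. This is a toy illustration only: the word lists, the tickers, and the transcript snippets are hypothetical, and the scoring rule (net positive share of matched words) is an assumption rather than the method used in the study.

```python
# Hypothetical sentiment dictionaries; real studies use much larger lexicons.
POSITIVE = {"growth", "strong", "improved", "record"}
NEGATIVE = {"decline", "weak", "loss", "impairment"}

def sentiment_score(text):
    """Net share of positive words among all matched sentiment words."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

def rank_by_sentiment(transcripts):
    """Return tickers ordered from most to least positive transcript."""
    scores = {ticker: sentiment_score(text) for ticker, text in transcripts.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Made-up transcript snippets for three made-up tickers.
TRANSCRIPTS = {
    "AAA": "Record revenue and strong growth this quarter.",
    "BBB": "Margins improved despite a decline in volumes.",
    "CCC": "Weak demand led to a loss and an impairment.",
}

if __name__ == "__main__":
    print(rank_by_sentiment(TRANSCRIPTS))
```

In a backtest of this kind, the ranking would be recomputed each month and the subsequent returns of the top- and bottom-ranked groups compared.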



Price Discovery in Bitcoin: The Impact of Unregulated Markets

Carol Alexander
Professor of Finance at University of Sussex
Abstract:

We analyse minute-level multi-dimensional information flows within and between bitcoin spot and derivatives. We show that perpetual swaps and futures traded on the unregulated exchanges Huobi, OKEx and BitMEX are much the strongest instruments for bitcoin price discovery and we examine potential determinants of their leadership strength. Prices on the regulated CME bitcoin futures and the US-based spot exchanges react to, rather than lead, price movements on the unregulated exchanges and they may do so relatively slowly. In a multi-dimensional setting including the main price leaders within futures, perpetuals and spot markets, the CME futures have a very minor effect on price discovery, even less than the spot exchanges Bitfinex, Bitstamp and Coinbase. Our findings highlight the persistent problems stemming from inconsistent regulation in bitcoin spot and derivatives markets, including insufficient price stability and lack of resistance to manipulative trading. We conclude that the SEC are correct to maintain such issues as their main concern for bitcoin ETF applications.



Solving for non-stationarity and position sizing

Jacques Joubert
Systematic Trader at Shell
Abstract:

In this lecture, we show how determining optimal position sizes and handling non-stationarity can be done with the help of a secondary model (meta-labelling, Dr. Marcos Lopez de Prado, 2018), which helps us both to filter out false positives and size our positions accordingly. We will evaluate the technique on the MNIST dataset and then transfer this knowledge to a toy trading strategy. We end by highlighting some technical difficulties when building these meta-models.
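
How a secondary model can both filter and size trades may be easier to see in a small sketch. It assumes the primary model supplies a side (+1 or -1) and the secondary model supplies a probability that the primary call is correct. The sizing rule below maps that probability through a normal CDF; it is one rule found in the meta-labelling literature and is used here as an assumption, not as the rule presented in the lecture. The 0.5 threshold is likewise hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bet_size(side, prob, threshold=0.5):
    """Signed position size in [-1, 1] from a side and a meta-model probability.

    Trades whose probability does not exceed the threshold are filtered out
    (size 0), which is how false positives of the primary model get removed.
    """
    if prob <= threshold:
        return 0.0
    prob = min(prob, 1.0 - 1e-9)  # guard against division by zero at prob = 1
    z = (prob - 0.5) / math.sqrt(prob * (1.0 - prob))
    return side * (2.0 * normal_cdf(z) - 1.0)

if __name__ == "__main__":
    for side, prob in [(1, 0.45), (1, 0.60), (1, 0.90), (-1, 0.80)]:
        print(side, prob, round(bet_size(side, prob), 3))
```

More confident meta-model predictions produce larger positions, while low-confidence signals are dropped entirely.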



Alternative (big) data for credit risk evaluations

Sudip Gupta
Professor, Director MS in QF Program at Gabelli School of Business, Fordham University
Abstract:




AI and business ethics in Financial Markets

Daniel Liebau
Founding Director at Lightbulb Capital
Abstract:




SoKat Disrupts Financial Markets

Jim Liew
Associate Professor in Finance at SoKat.co and at Johns Hopkins Carey Business School
Abstract:

- Dad’s 50-year investment horizon
- Innovating From the Ivory Tower
- Price transparency unleashes the floodgates for FinTech
- Re-imagining Credit Market Solutions
- Wall Street Changed Forever!



Cluster Portfolios

Ali Akansu
Professor of Electrical & Computer Engineering at NJIT
Abstract:

This paper discusses portfolio construction for investing in N given assets, e.g. constituents of the Dow Jones Industrial Average (DJIA) or large-cap stocks, which is based on partitioning the investment universe into clusters. The clusters are determined from the trailing correlation matrix via an algorithm that uses thresholding of high-correlation pairs. We calculate the principal eigenvector of each cluster from its correlation matrix and the corresponding eigenportfolio. The cluster portfolios are combined into a single N-asset portfolio based on a weighting scheme for the clusters. Various tests conducted on components of DIA and a thirty-stock basket of large-cap stocks indicate that the new portfolios are superior to the DIA and other mean-variance portfolios in terms of risk-adjusted returns from 2009 to 2019. This gives convincing evidence that cluster-based portfolios can outperform passive investing.
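
The pipeline in this abstract (threshold the correlation matrix, take each cluster's principal eigenvector, combine the cluster portfolios) can be sketched in plain Python. Several details are assumptions made for illustration and are not taken from the paper: clusters are formed as connected components of the graph linking pairs with correlation at or above the threshold, eigenportfolio weights are the principal eigenvector rescaled to sum to one, clusters are weighted equally, and the 4x4 correlation matrix is invented.

```python
def clusters_by_threshold(corr, threshold):
    """Group assets into connected components of high-correlation pairs."""
    unvisited = set(range(len(corr)))
    clusters = []
    while unvisited:
        start = min(unvisited)
        unvisited.discard(start)
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            linked = [j for j in unvisited if corr[i][j] >= threshold]
            for j in linked:
                unvisited.discard(j)
                stack.append(j)
        clusters.append(sorted(members))
    return clusters

def principal_eigenvector(matrix, iterations=200):
    """Power iteration for the eigenvector of the largest eigenvalue."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def cluster_portfolio(corr, threshold):
    """Equal-weight the clusters; weight assets inside each by its eigenvector.

    Assumes each cluster's eigenvector entries do not sum to zero.
    """
    clusters = clusters_by_threshold(corr, threshold)
    weights = [0.0] * len(corr)
    for members in clusters:
        sub = [[corr[i][j] for j in members] for i in members]
        vec = principal_eigenvector(sub)
        total = sum(vec)
        for i, x in zip(members, vec):
            weights[i] = (x / total) / len(clusters)
    return clusters, weights

# Invented trailing correlation matrix for four assets.
CORR = [
    [1.0, 0.8, 0.7, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.7, 0.6, 1.0, 0.0],
    [0.1, 0.2, 0.0, 1.0],
]

if __name__ == "__main__":
    print(cluster_portfolio(CORR, threshold=0.5))
```

With a threshold of 0.5, the first three assets form one cluster and the fourth stands alone, so each cluster receives half of the capital in this equal-weight scheme.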



Do not Rescale the Y Axis!

Ari Pine
Co-founder at Digital Gamma
Abstract:

Data analysis when the rules change: everything other than the last 2 months is a flat line.



Think Probabilistically? Model Probabilistically. Applications of TensorFlow Probability in Finance.

Aaron Miles
Data Scientist at Covail
Abstract:

One of the common phrases heard in financial modeling is to think probabilistically. In other words, don't just evaluate a decision based on what happened; evaluate it on what was likely to happen, or what would have been highly profitable if it had happened. But most of the time, the tools needed to examine the empirical distribution of possible outcomes are not available. TensorFlow Probability enables this kind of modeling. TensorFlow Probability is a module which effectively enables TensorFlow to model various probability distributions (currently, there are about 80 supported). In other words, TensorFlow Probability enables one to fit full distributions rather than single points. From these distributions, you can estimate quantiles, cumulative probability at a given point, and better estimates of upside and downside potential. In this paper, I'll be focusing on the Sinh-Arcsinh transformation of the normal distribution, because it's a very flexible transformation that can account for skew and tailweight. This matters because financial data are often not normally distributed, and predicting fat tails can be just as important as a point prediction, if not more so. To do this, I will model daily stock return data to examine how effectively TensorFlow Probability captures different return distributions, demonstrate the flexibility of the approach in answering various types of questions, and discuss the computational benefits of using TensorFlow as opposed to packages like gamlss.
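
To show what a sinh-arcsinh transformation does to a normal variable, here is a pure-Python sketch that does not use TensorFlow Probability at all. It assumes the parameterization sinh((asinh(z) + skewness) * tailweight); the library's own distribution also applies a location, a scale, and a normalizing constant that are not reproduced here, and all parameter values below are hypothetical.

```python
import math
import random

def sinh_arcsinh(z, skewness=0.0, tailweight=1.0):
    """Transform a standard-normal draw; identity when skewness=0, tailweight=1."""
    return math.sinh((math.asinh(z) + skewness) * tailweight)

def sample_returns(n, loc, scale, skewness, tailweight, seed=0):
    """Draw n transformed-normal samples (a stand-in for daily returns)."""
    rng = random.Random(seed)
    return [
        loc + scale * sinh_arcsinh(rng.gauss(0.0, 1.0), skewness, tailweight)
        for _ in range(n)
    ]

def empirical_quantile(samples, q):
    """Simple order-statistic quantile of a sample."""
    ordered = sorted(samples)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]

if __name__ == "__main__":
    # Hypothetical parameters: slight negative skew and heavier tails.
    draws = sample_returns(10_000, loc=0.0, scale=0.01, skewness=-0.2, tailweight=1.3)
    print("1% quantile:", empirical_quantile(draws, 0.01))
    print("99% quantile:", empirical_quantile(draws, 0.99))
```

Working with a full sample of outcomes, rather than a single point forecast, is what makes quantile and tail estimates like these available.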



Corporate Venture Capital (CVC): Corporations Make Strange Bed-Fellows

Mick Simonelli
Principal at Simonelli Innovation
Abstract:

- The nature of corporate venture capital
- Major benefits of corporate venture capital
- Major detractions of corporate venture capital
- Practical examples of CVC/FinTech partnerships
- Conclusion: Even though "strange", FinTech/CVC partnerships remain a viable option for growth



Introducing irrationality in logical decision making using Deep Learning: a journey from Time Series to Anomaly Detection.

Konstantinos Aloupis
Head Data Scientist at European Dynamics
Abstract:

1. How to fortify the human brain in its Efficiency vs. Effectiveness battle.
2. The Art of Decision Making.
3. Learn to build in the new era: Stats and math are the bricks and mortar.
4. Aesthetics of Artificial Intelligence Architectures: The two most common misconceptions about Deep Neural Networks.
5. Are we inadvertently human “robots” already?



Optimal Trading Rules Detection under Concurrent Labels, Capital Constraints and Transaction Costs

Oleksandr Proskurin
CIO at Machine Factor Technologies
Abstract:

Labelling is a key part of any machine learning model. That is why the quantitative researcher needs to carefully choose the optimal labelling method to solve a particular financial machine learning problem. Investment management problems require domain-specific labelling techniques such as triple-barrier, trend-scanning, etc. In practice, we need to choose optimal parameters (trading rules) for labelling. For example, the triple-barrier method is a function of the profit-taking level, the stop-loss level and the maximum time in a position. Marcos Lopez de Prado suggests using Monte Carlo simulations of a traded asset to define the optimal trading rules. The approach tackles the problem of overfitting parameters on a single observed path. However, the researcher needs to take into account capital restrictions and transaction costs as a function of stop-loss/profit-taking levels. With tight profit-taking and stop-loss levels, the algorithm will trade more often, with smaller capital allocations dedicated to each trade, leading to higher transaction costs. On the other hand, by increasing the maximum time in each trade, the system will trade less frequently, with longer runs and bigger capital allocations. All the factors described above may have a big impact on the algorithm's risk-adjusted performance. The framework suggested in this paper shows how to choose optimal trading rules, taking into account capital restrictions, for a meta-labelling system trading VIX futures. The paper answers several questions:
* What is the optimal trading rule, taking into account capital restrictions, concurrent labels/bets and transaction costs? How can a custom position-sizing system be taken into account in optimal trading rule detection?
* How does increasing or decreasing the accuracy rate/F1-score of the machine learning model affect the expected average Sharpe ratio? Which accuracy/F1-score leads to the Sharpe ratio required by the investment team/client?
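
The three parameters of the triple-barrier method can be made concrete with a short sketch. It labels a single entry point on a price path by whichever barrier is touched first. The function name, the price path, and the convention of returning 0 when the time barrier is reached first are assumptions for illustration (some variants label by the sign of the return at expiry instead); capital constraints and transaction costs are not modelled.

```python
def triple_barrier_label(prices, entry, profit_take, stop_loss, max_holding):
    """Label one trade by the first barrier hit after the entry index.

    profit_take and stop_loss are positive fractional returns (0.02 = 2%);
    max_holding is the maximum number of bars the position may stay open.
    Returns (label, exit_index), with label +1 for the upper barrier,
    -1 for the lower barrier, and 0 if the time barrier is reached first.
    """
    entry_price = prices[entry]
    last = min(entry + max_holding, len(prices) - 1)
    for t in range(entry + 1, last + 1):
        ret = prices[t] / entry_price - 1.0
        if ret >= profit_take:
            return 1, t
        if ret <= -stop_loss:
            return -1, t
    return 0, last

# Invented price path.
PRICES = [100.0, 101.0, 103.0, 99.0, 96.0]

if __name__ == "__main__":
    print(triple_barrier_label(PRICES, 0, profit_take=0.02, stop_loss=0.03, max_holding=4))
    print(triple_barrier_label(PRICES, 0, profit_take=0.05, stop_loss=0.03, max_holding=4))
    print(triple_barrier_label(PRICES, 0, profit_take=0.05, stop_loss=0.05, max_holding=3))
```

Tightening the two horizontal barriers makes trades close sooner, which is the mechanism behind the higher trade frequency and transaction costs the abstract describes.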



The story of Number Patterns & Global Stock Markets

Saurabh Pathak
Founder at numPlorer
Abstract:

- Have we turned all the right stones of Natural Intelligence before moving on to Artificial Intelligence?
- Story of Number Patterns and global markets: How did we predict 12432, a precise top on Nifty, a year ago?
- Single algorithm to trade on Global Stock markets (why not? after all, the Universe has a single language: Mathematics)
- Combining Mathematics with Behavioral psychology
- Asking the 2nd WHY in Big Data to find more insightful analytical answers



How data is transforming investment management

Dori Levanoni
Partner, Investments, Chief Investment Strategist at First Quadrant, L.P.
Abstract:



The Micro-price: Estimating the fair price, given the state of the order book.

Sasha Stoikov
Senior Research Associate at CFEM (Cornell Financial Engineering Manhattan)
Abstract:

The micro-price is the fundamental price of an asset, given the state of the order book. In this presentation, I will define it to be the limit of a sequence of expected mid-prices and provide conditions for this limit to exist. The micro-price may be expressed as an adjustment to the mid-price that takes into account the bid-ask spread and the imbalance. The micro-price can be estimated using high frequency data. I will provide an IPython notebook and a small data sample, so you can explore the methods and implement this on your own data sets.
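
The quantities the adjustment depends on (mid-price, bid-ask spread, and imbalance) are simple to compute from the top of the book. The sketch below computes them and, as a reference point, the imbalance-weighted mid-price. Note that this is not the micro-price estimator itself, which is defined as a limit of expected mid-prices and must be estimated from data; the definition of imbalance as bid size over total size and the quote values are assumptions for illustration.

```python
def book_state(bid, ask, bid_size, ask_size):
    """Mid-price, spread, and imbalance from the best bid and ask."""
    mid = (bid + ask) / 2.0
    spread = ask - bid
    imbalance = bid_size / (bid_size + ask_size)
    return mid, spread, imbalance

def weighted_mid(bid, ask, bid_size, ask_size):
    """Imbalance-weighted mid: the mid shifted by spread * (imbalance - 1/2).

    Algebraically equal to imbalance * ask + (1 - imbalance) * bid.
    """
    mid, spread, imbalance = book_state(bid, ask, bid_size, ask_size)
    return mid + spread * (imbalance - 0.5)

if __name__ == "__main__":
    # Hypothetical quote: heavier bid side pushes the estimate toward the ask.
    print(book_state(99.0, 101.0, 300, 100))
    print(weighted_mid(99.0, 101.0, 300, 100))
```

An estimated micro-price would replace the linear shift above with an adjustment learned from how mid-prices actually evolve for each spread and imbalance state.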



Machine learning and data-driven solutions for cost-efficient connectivity.

Sabidur Rahman
Researcher at UC Davis, AT&T Labs
Abstract:

Machine Learning (ML) and data-driven solutions have revolutionized many areas of technology. Communication technology is also increasingly benefiting from such solutions. Automated network resource management powered by ML and data-driven solutions can help to reduce the cost of connectivity, free up more bandwidth, and foster innovation in connected services, leading to a more connected society and more connected businesses. Many time-consuming and complex tasks of network resource management are being automated, thanks to virtualization of network components, advancements in artificial intelligence, and insights learned from data. Sabidur Rahman's research work with Networks Research Labs at UC Davis and AT&T Labs explores important problems in this area of research.



ML-based portfolio construction

Marcos Lopez de Prado
CIO and Professor of Practice at TRUE POSITIVE TECHNOLOGIES and Cornell University
Abstract:



Predicting future stock market structure by combining social and financial network information.

Tharsis Souza
Vice President, Products at Two Sigma
Abstract:

• Social opinion structure is relevant to predicting stock market correlation structure.
• The proposed model improves market structure predictions by up to 40%.
• Social media leads to improved models, particularly in long-term predictions.



Online Portfolio Selection: Pattern Matching

Alex Kwon
Researcher at Hudson & Thames Quantitative Research
Abstract:

Online Portfolio Selection is an algorithmic trading strategy that sequentially allocates capital among a group of assets to maximize the final returns of the investment. Leveraging modern computational techniques, the strategies process huge amounts of data to find the optimal combination of portfolio weights. We will primarily focus on a pattern matching strategy, Correlation Driven Nonparametric Learning, to provide a different approach to portfolio selection.
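
The pattern-matching idea can be sketched for two assets: find past windows of price relatives that correlate strongly with the most recent window, then pick the weights that would have done best on the days that followed those windows. This is a simplified illustration in the spirit of Correlation Driven Nonparametric Learning, not the presenter's implementation: the coarse grid search, the equal-weight fallback when no window matches, the parameter defaults, and the price relatives are all assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def flatten(window):
    """Concatenate the per-day price relatives of a window into one vector."""
    return [r for day in window for r in day]

def similar_days(relatives, window, rho):
    """Indices of days whose preceding window resembles the latest window."""
    t = len(relatives)
    latest = flatten(relatives[t - window:t])
    return [
        i for i in range(window, t)
        if pearson(flatten(relatives[i - window:i]), latest) >= rho
    ]

def corn_weights(relatives, window=2, rho=0.9, grid=20):
    """Two-asset weights maximizing wealth over the matched days (grid search)."""
    matches = similar_days(relatives, window, rho)
    if not matches:
        return [0.5, 0.5]  # no similar history: fall back to equal weights
    best_weights, best_wealth = [0.5, 0.5], -1.0
    for k in range(grid + 1):
        b = k / grid
        wealth = 1.0
        for i in matches:
            wealth *= b * relatives[i][0] + (1.0 - b) * relatives[i][1]
        if wealth > best_wealth:
            best_weights, best_wealth = [b, 1.0 - b], wealth
    return best_weights

# Invented daily price relatives (today's price / yesterday's) for two assets.
PRICE_RELATIVES = [
    [1.1, 0.9], [0.9, 1.1], [1.2, 0.8],
    [1.0, 1.0], [1.1, 0.9], [0.9, 1.1],
]

if __name__ == "__main__":
    print(similar_days(PRICE_RELATIVES, window=2, rho=0.9))
    print(corn_weights(PRICE_RELATIVES, window=2, rho=0.9))
```

In this toy history the latest two-day pattern occurred once before and was followed by a strong day for the first asset, so the sketch allocates everything to it.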



Beyond the Buzz - Integrating ESG Data into Investment Processes

Alexandria Fisher
Senior Strategic Analyst at Government of Alberta, Ministry of Energy
Abstract:

- Trust, Transparency, and ESG Data: Embracing Asymmetry and Approximation
- Customization is Crucial: Understanding Bias in ESG Scores
- Materiality Matters: Quiet the Noise to Add Value and Mitigate Risk
- Alpha and ESG: Internalizing Externalities
- Risk Mitigation and ESG: Volatility, Cumulative Risk, and Grey Rhinos
- Time Horizons: Long-Termism and the Efficacy of ESG Strategies
