I recently posted an article featuring a non traditional approach to find large prime numbers. The research section of this article offers interesting challenges, both for data scientists interested in mathematics, and for mathematicians interested in data science and big data. My approach is data, pattern recognition, and machine learning heavy. Here is the introduction:
Large prime numbers have been a topic of considerable research, for its own mathematical beauty, as well as to develop more powerful cryptographic applications and random number generators. In this article, we show how big data, statistical science (more specifically, pattern recognition) and the use of new efficient, distributed algorithms, could lead to an original research path to discover large primes. Here we also discuss new mathematical conjectures related to our methodology.
Much of the focus so far has been on discovering raw large primes: Any time a new one, bigger than all predecessors, is found, it gets a lot of attention even beyond the mathematical community. Here we explore a different path: finding numbers (usually not primes) that have a very large prime factor. In short, we are looking for special integer-valued functions f(n) such that f(n) has a prime factor bigger than n, hopefully much bigger than n, for most values of n.
Source for picture: click here
The distribution of the largest prime factor has been studied extensively. If we choose a function that grows fast enough, one would expect that the largest prime factor of f(n) will always be larger than n. However, this would lead to intractable factoring issues to find the large primes in question. So in practice, we are interested in functions f(n) that do not grow too fast. The problem is that many, if not most very large integers, are friable : their largest prime factor is a relatively small prime. I like to call them porous numbers. So the challenge is to find a function f(n) that is not growing too fast, and that somehow produces very few friable numbers as n becomes extremely large. Read the full article here.
For another interesting challenge, read the section “Potential Areas of Research” in my article How to detect if numbers are random or not. For other articles featuring difficult mathematical problems, click here. For a statistical problem with several potential applications (clustering, data reduction) click here and read the last section. More challenges can be found here. .

