The Infinite Hotel!

Originally posted on Phitagóricos:

When I was taking Calculus I in the Faculty of Sciences with Jefferson (long live Jeff!), he opened the course with a story I have always remembered as "The tale of the infinite hotels". Thanks to it I came to understand a great deal about infinities, and it is the foundation every mathematician should have.

I have been preparing some posts on the Cantor set and other odd things I come across, but since I need to research and organize everything properly before publishing them, they will take a little more time. So, in the meantime, here is the tale of the infinite hotels that I have enjoyed so much, and that I hope you will enjoy too.

The Extraordinary Hotel, or the Thousand and First Journey of Ion the Quiet, by Stanislaw Lem

I got home quite late; the meeting at the "Andromeda Nebula" club had run on past midnight. Terrible nightmares haunted me when…

View original post 2,561 more words

Why You Should Use Cross-Entropy Error Instead Of Classification Error Or Mean Squared Error For Neural Network Classifier Training

Originally posted on James D. McCaffrey:

When using a neural network to perform classification and prediction, it is usually better to use cross-entropy error than classification error, and somewhat better to use cross-entropy error than mean squared error, to evaluate the quality of the neural network. Let me explain. The basic idea is simple, but there are a lot of related issues that can obscure it. First, let me make it clear that we are dealing only with a neural network that is used to classify data, such as predicting a person's political party affiliation (democrat, republican, other) from predictor variables such as age, sex, annual income, and so on. We are not dealing with a neural network that does regression, where the value to be predicted is numeric, or a time-series neural network, or any other kind of neural network.

Now suppose you have just three training data items. Your neural network…
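The three error measures can be compared on a toy three-item example in the spirit of the post's setup. The probabilities below are invented for illustration and are not taken from the original article:

```python
import math

# Illustrative only: three training items, three classes; targets are one-hot.
targets = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Two hypothetical networks' output probabilities on the same three items.
outputs_a = [[0.4, 0.3, 0.3], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]   # barely correct on all three
outputs_b = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.3, 0.4, 0.3]]  # confident, but one miss

def classification_error(outputs, targets):
    # Fraction of items where the highest-probability class is not the target class.
    wrong = sum(1 for o, t in zip(outputs, targets)
                if o.index(max(o)) != t.index(max(t)))
    return wrong / len(outputs)

def mean_squared_error(outputs, targets):
    # Sum of squared differences between output and target, averaged over items.
    return sum((oi - ti) ** 2 for o, t in zip(outputs, targets)
               for oi, ti in zip(o, t)) / len(outputs)

def mean_cross_entropy(outputs, targets):
    # -sum over classes of target * ln(output), averaged over items.
    return -sum(ti * math.log(oi) for o, t in zip(outputs, targets)
                for oi, ti in zip(o, t)) / len(outputs)

print(classification_error(outputs_a, targets))  # 0.0: all three items just barely correct
print(classification_error(outputs_b, targets))  # one item of three is wrong
print(mean_cross_entropy(outputs_a, targets))    # penalizes the low confidence of network A
print(mean_cross_entropy(outputs_b, targets))    # rewards the confident correct answers of B
```

Note that classification error sees network A as perfect and B as worse, while cross-entropy, which looks at the full probabilities, tells a more nuanced story. That contrast is the heart of the post's argument.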

View original post 953 more words

A gentle introduction to Naïve Bayes classification using R

Originally posted on Eight to Late:

Preamble

One of the key problems of predictive analytics is to classify entities or events based on a knowledge of their attributes. An example: one might want to classify customers into two categories, say 'High Value' or 'Low Value', based on a knowledge of their buying patterns. Another example: to figure out the party allegiances of representatives based on their voting records. And yet another: to predict the species of a particular plant or animal specimen based on a list of its characteristics. Incidentally, if you haven't been there already, it is worth having a look at Kaggle to get an idea of some of the real-world classification problems that people tackle using techniques of predictive analytics.

Given the importance of classification-related problems, it is no surprise that analytics tools offer a range of options. My favourite (free!) tool, R, is no exception: it has a plethora of state-of-the-art packages…
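The post itself works through R packages; as a language-agnostic sketch of the underlying computation, here is a tiny categorical Naïve Bayes (with add-one smoothing) on invented 'High Value'/'Low Value' customer data, in Python:

```python
from collections import Counter

# Toy data, invented for this sketch: (visit frequency, basket size) -> value class.
train = [
    (("frequent", "large"), "High"),
    (("frequent", "small"), "High"),
    (("rare", "large"), "High"),
    (("rare", "small"), "Low"),
    (("rare", "small"), "Low"),
    (("frequent", "small"), "Low"),
]

def naive_bayes_posteriors(train, x):
    """Posterior P(class | x) under the 'naive' attribute-independence assumption."""
    classes = Counter(c for _, c in train)
    n = len(train)
    scores = {}
    for c, count in classes.items():
        # Prior P(c) times the product of per-attribute likelihoods P(x_i | c),
        # with Laplace (add-one) smoothing to avoid zero counts.
        score = count / n
        rows = [f for f, cc in train if cc == c]
        for i, v in enumerate(x):
            matches = sum(1 for f in rows if f[i] == v)
            n_values = len(set(f[i] for f, _ in train))
            score *= (matches + 1) / (len(rows) + n_values)
        scores[c] = score
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

post = naive_bayes_posteriors(train, ("frequent", "large"))
print(max(post, key=post.get))  # a frequent, large-basket customer classifies as "High"
```

The R packages mentioned in the post wrap exactly this kind of computation (plus density estimates for numeric attributes) behind a model-fitting interface.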

View original post 3,076 more words

A gentle introduction to decision trees using R

Originally posted on Eight to Late:

Introduction

Most techniques of predictive analytics have their origins in probability or statistical theory (see my post on Naïve Bayes, for example). In this post I'll look at one that has a more commonplace origin: the way in which humans make decisions. When making decisions, we typically identify the options available and then evaluate them based on criteria that are important to us. The intuitive appeal of such a procedure is in no small measure due to the fact that it can be easily explained through a visual. Consider the following graphic, for example:

Figure 1: Example of a simple decision tree (Courtesy: Duncan Hull)

(Original image: https://www.flickr.com/photos/dullhunk/7214525854, Credit: Duncan Hull)

The tree structure depicted here provides a neat, easy-to-follow description of the issue under consideration and its resolution. The decision procedure is based on asking a series of questions, each of which serve to further reduce the…
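The question-asking procedure can be sketched directly as code: each internal node of the tree is a question, each leaf a decision. This toy tree is invented for illustration and is not the tree from Figure 1:

```python
# A decision tree is just nested questions; each answer narrows the options.
# This made-up tree decides how to travel to work.
def how_to_travel(distance_km, raining, own_car):
    if distance_km < 2:                       # question 1: is it close?
        return "walk" if not raining else "bus"  # question 2: is it raining?
    if own_car:                               # question 3: do I have a car?
        return "drive"
    return "bus"

print(how_to_travel(1, raining=False, own_car=False))   # short dry trip: walk
print(how_to_travel(10, raining=True, own_car=True))    # long trip with a car: drive
```

Decision-tree learners such as those in the R packages the post goes on to discuss automate the choice of which question to ask at each node.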

View original post 2,675 more words

Yet Another Lambda Tutorial

Originally posted on Python Conquers The Universe:

There are a lot of tutorials[1] for Python's lambda out there. A very helpful one is Mike Driscoll's discussion of lambda on the Mouse vs Python blog. Mike's discussion is excellent: clear, straightforward, with useful illustrative examples. It helped me, finally, to grok lambda, and led me to write yet another lambda tutorial.


Lambda is a tool for building functions

Lambda is a tool for building functions, or more precisely, for building function objects. That means that Python has two tools for building functions: def and lambda.

Here’s an example. You can build a function in the normal way, using def, like this:

or you can use lambda:
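(Again a reconstruction of the lost snippet, mirroring the def version above:)

```python
# lambda builds the same function object; we then bind it to a name ourselves.
square_root = lambda x: x ** 0.5

print(square_root(9))  # 3.0
```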

Here are a few other interesting examples of lambda:
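(The post's examples here were also lost to extraction; these are reconstructions in the same spirit, not the originals:)

```python
add = lambda a, b: a + b                  # a lambda with two arguments
always_42 = lambda: 42                    # a lambda with no arguments
compose = lambda f, g: lambda x: f(g(x))  # a lambda that returns a lambda

double_then_inc = compose(lambda n: n + 1, lambda n: n * 2)
print(add(2, 3))           # 5
print(always_42())         # 42
print(double_then_inc(5))  # 11, i.e. (5 * 2) + 1
```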


What is lambda good for? Why do we need lambda?

Actually, we don’t absolutely need lambda; we could get along without it. But there are certain situations…

View original post 1,906 more words

Mercer’s Theorem and SVMs

Originally posted on Patterns of Ideas:

In a funny coincidence, this post has the same basic structure as my previous one: proving some technical result, and then looking at an application to machine learning. This time it’s Mercer’s theorem from functional analysis, and the kernel trick for SVMs. The proof of Mercer’s theorem mostly follows Lax’s Functional Analysis.

1. Mercer’s Theorem

Consider a real-valued function $K(s,t)$, and the corresponding integral operator $\mathbf{K}: L^2[0,1] \rightarrow L^2[0,1]$ given by

$$(\mathbf{K} u)(s) = \int_0^1 K(s,t)\, u(t)\, dt.$$

We begin with two facts connecting the properties of $K$ to the properties of $\mathbf{K}$.

Proposition 1. If $K$ is continuous, then $\mathbf{K}$ is compact.

Proof: Consider a bounded sequence $\{f_n\}_{n=1}^{\infty} \subset L^2[0,1]$. We wish to show that the image of this sequence, $\{\mathbf{K} f_n\}_{n=1}^{\infty}$, has a convergent subsequence. We show that $\{\mathbf{K} f_n\}$ is equicontinuous, and Arzela-Ascoli then gives a…
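As a numerical aside (not part of the post's proof): for a kernel satisfying Mercer's condition, every Gram matrix $G_{ij} = K(s_i, s_j)$ built from sample points is positive semidefinite, which is the finite-dimensional face of the kernel trick for SVMs. A pure-Python spot check on an arbitrarily chosen polynomial kernel:

```python
import random

# A continuous, symmetric, positive semidefinite (Mercer) kernel on [0,1] x [0,1].
def kernel(s, t):
    return (1 + s * t) ** 2  # polynomial kernel, chosen arbitrarily for this sketch

points = [0.1, 0.4, 0.7, 0.9]  # arbitrary sample points in [0, 1]
G = [[kernel(s, t) for t in points] for s in points]

# The quadratic form v^T G v should never be (significantly) negative.
random.seed(0)
for _ in range(1000):
    v = [random.uniform(-1, 1) for _ in points]
    quad = sum(v[i] * G[i][j] * v[j]
               for i in range(len(points)) for j in range(len(points)))
    assert quad >= -1e-12

print("quadratic form stayed nonnegative on 1000 random vectors")
```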

View original post 848 more words

Picking a colour scale for scientific graphics

Originally posted on Better Figures:

Here are some recommendations for making scientific graphics which help your audience understand your data as easily as possible. Your graphics should be striking, readily understandable, should avoid distorting the data (unless you really mean to), and be safe for those who are colourblind. Remember, there are no really “right” or “wrong” palettes (OK, maybe a few wrong ones), but studying a few simple rules and examples will help you communicate only what you intend.

What kind of palettes for maps?

For maps of quantitative data that have an order, use an ordered palette. If the data are sequential, continually increasing or decreasing, then use a brightness ramp (e.g. light to dark shades of grey, blue, or red) or a hue ramp (e.g. cycling from light yellow to dark blue). In general, people interpret darker colours as representing "more". These colour palettes can be downloaded from Color…
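A brightness ramp of the kind described above can be built by interpolating between a light and a dark shade. In this sketch the endpoint colours are my own arbitrary picks, not a recommendation from the post:

```python
# Build a light-to-dark sequential ramp by linear interpolation in RGB.
def ramp(light, dark, n):
    steps = []
    for i in range(n):
        t = i / (n - 1)  # 0.0 at the light end, 1.0 at the dark end
        rgb = [round(l + t * (d - l)) for l, d in zip(light, dark)]
        steps.append("#{:02x}{:02x}{:02x}".format(*rgb))
    return steps

# Arbitrary light-blue to dark-blue endpoints for illustration.
palette = ramp((222, 235, 247), (8, 48, 107), 5)
print(palette)  # five hex colours running light to dark
```

In practice one would normally pull a perceptually tuned palette from a tool like the one the truncated sentence refers to rather than interpolate raw RGB, since linear RGB steps are not perceptually uniform.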

View original post 1,117 more words

Will AI Replace Radiologists?

Originally posted on Über-Coder:

Ever since Geoffrey Hinton, who is considered the father of deep learning, said at a conference of radiologists that the field will soon be taken over by AI (Artificial Intelligence), a huge debate has erupted among radiologists and AI experts over whether this is a possibility in the foreseeable future.
If we look closely, the answer is clearly no. Radiologists will not be replaced by AI systems. However, the nature of the job of radiologists will change.
The reason there is confusion in the first place is that radiology is (mainly) a branch of medicine. And medicine is largely misunderstood outside its practitioners. Furthermore, there are very few people in the world who have a good knowledge of both AI and medicine, and of the statistical inference associated with them.
In classical texts, medical diagnosis is always described as a protocol where a set of procedures is followed…

View original post 1,019 more words

Sample Size Estimation for Machine Learning Models Using Hoeffding’s Inequality

Originally posted on Über-Coder:
Wassily Hoeffding (1914–1991) was one of the founding fathers of non-parametric statistics (picture credit: http://stat-or.unc.edu)

Deep learning is the talk of the town these days, and with the advent of frameworks like TensorFlow, Keras, scikit-learn, etc., anyone can implement it with ease. This is why everyone's first hunch when dealing with data is to somehow apply deep learning to it, or at least some form of machine learning. However, what most of us don't realize is that to have a theoretical guarantee over learning, and then testing in such a way that error is minimized when the model is deployed in the real world, we need considerably large data sets. And such large data sets are very hard to get.

This theoretical guarantee is of utmost importance when dealing with medical or health related data because to generate confidence intervals (values between which your point predictors in-sample can…
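The guarantee in question comes from Hoeffding's inequality: for a bounded loss, P(|empirical error - true error| > eps) <= 2 exp(-2 eps^2 N). Requiring the right-hand side to be at most delta and solving for N gives a sample-size estimate; a sketch (the post's exact formulation may differ):

```python
import math

# Hoeffding's inequality: P(|empirical - true error| > eps) <= 2 exp(-2 eps^2 N).
# Setting the bound equal to delta and solving for N:
#   N >= ln(2 / delta) / (2 eps^2)
def hoeffding_sample_size(eps, delta):
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

# e.g. to pin down error within eps = 0.05 with 95% confidence (delta = 0.05):
print(hoeffding_sample_size(0.05, 0.05))  # 738 samples
```

Note how the tolerance eps dominates: halving it quadruples the required sample size, while tightening the confidence delta costs only logarithmically. That asymmetry is why medical data sets so rarely meet the bound.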

View original post 467 more words

How to configure local computer for FastAI course

Originally posted on Code Yarns 👨‍💻:

I wanted to check out the Practical Deep Learning for Coders course by FastAI. However, I noticed that the course provided configuration instructions mainly for cloud GPU instance providers like Paperspace. I have a notebook and a desktop computer with powerful NVIDIA GPUs and wanted to try the course on my local machines. The course material is also provided in the form of Jupyter notebooks, but I intended to turn those into Python programs to run locally.

Here are the steps I followed to set up my local computer for the FastAI course:

  • The local computer was running Ubuntu 16.04 and NVIDIA drivers were already installed on it and working.
  • CUDA 9.0 was installed using the online instructions from NVIDIA.
  • The latest release of CuDNN was installed as described here.
  • Conda was installed and configured as described here. You should be able to run conda info from…
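The conda part of the steps above might look something like the following in a terminal; these commands are a hypothetical sketch, and the environment name, channels, and package names are assumptions rather than commands taken from the original post:

```shell
# Hypothetical sketch of a conda environment for the course; exact package
# names, channels, and versions are assumptions, not from the original post.
conda create --name fastai python=3.6
conda activate fastai
conda install -c pytorch -c fastai fastai
conda info --envs   # verify the new environment is listed
```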

View original post 149 more words