Notes on the limiting distribution of incomplete U-statistics

The main results on the limiting distributions of incomplete U-statistics were developed in Blom (1976) and Janson (1984); Lee (1990) gives a summary. However, for my taste, the proofs in Janson (1984) are somewhat hard to read. Lee (1990) improves upon those but has some inaccuracies - the main …
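
For readers unfamiliar with the object itself, here is a minimal sketch of a complete versus an incomplete U-statistic. It assumes the variance kernel h(x, y) = (x - y)^2 / 2 and a design that draws m pairs uniformly without replacement; both choices, and all function names, are illustrative assumptions and not taken from the post or the cited papers.

```python
import itertools
import random
import statistics


def kernel(x, y):
    """Symmetric kernel whose complete U-statistic is the sample variance."""
    return 0.5 * (x - y) ** 2


def complete_u(sample):
    """Average the kernel over all unordered pairs of observations."""
    pairs = list(itertools.combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def incomplete_u(sample, m, rng):
    """Average the kernel over m pairs drawn without replacement (assumed design)."""
    pairs = list(itertools.combinations(sample, 2))
    chosen = rng.sample(pairs, m)
    return sum(kernel(x, y) for x, y in chosen) / m


rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(30)]
u_complete = complete_u(sample)               # uses all 435 pairs
u_incomplete = incomplete_u(sample, 60, rng)  # uses only 60 of them
```

The incomplete version trades some statistical efficiency for fewer kernel evaluations, which is why its limiting distribution depends on how the subset of pairs is chosen.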


Highlighting pandas .to_latex() output in bold face for extreme values

When preparing a table with experimental results for publication, one often wishes to highlight the output of extreme cells, for example by putting them in bold.

However, pandas does not readily support this use case. The closest one gets with stock pandas is table.style.highlight_max(axis=1) which highlights …
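
One possible workaround, sketched here under assumptions, is to convert the cells to strings and wrap each row's maximum in `\textbf{}` before exporting. The helper name `bold_row_max`, the example data, and the number format are hypothetical and not necessarily what the post does.

```python
import pandas as pd


def bold_row_max(table: pd.DataFrame, fmt: str = "{:.2f}") -> pd.DataFrame:
    """Return a string-valued copy with each row's maximum wrapped in \\textbf{}."""
    row_max = table.max(axis=1)
    cells = {}
    for col in table.columns:
        is_max = table[col] == row_max          # True where this cell is the row maximum
        text = table[col].map(fmt.format)       # format every number as a string
        cells[col] = text.where(~is_max, "\\textbf{" + text + "}")
    return pd.DataFrame(cells)


# Made-up example values, for illustration only.
results = pd.DataFrame(
    {"method A": [0.81, 0.64], "method B": [0.79, 0.70]},
    index=["dataset 1", "dataset 2"],
)
formatted = bold_row_max(results)
```

Calling `formatted.to_latex(escape=False)` should then keep the backslashes intact, since escaping would otherwise turn the LaTeX command into literal text.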


Affine transformations in python

Here is just a short note on how to do an affine transformation in python succinctly.

Code

import numpy as np
import matplotlib.pyplot as plt
data = np.ones((1000,3)) # needed for the transformation
data[:,0:2] = np.random.uniform(low=-1,high=1,size=(len(data),2)) # create …
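
The excerpt above stops before the transformation itself. As a minimal sketch of the idea behind the extra column of ones: in homogeneous coordinates, rotation and translation combine into a single 3x3 matrix, so the whole transformation is one matrix product. The function name `affine_matrix` and the chosen angle and shift are assumptions for illustration.

```python
import numpy as np


def affine_matrix(angle: float, shift_x: float, shift_y: float) -> np.ndarray:
    """Build a 3x3 matrix that rotates by `angle` (radians) and then translates."""
    return np.array([
        [np.cos(angle), -np.sin(angle), shift_x],
        [np.sin(angle),  np.cos(angle), shift_y],
        [0.0,            0.0,           1.0],
    ])


# Four points in homogeneous coordinates: the third column stays 1.
points = np.ones((4, 3))
points[:, 0:2] = [[0, 0], [1, 0], [0, 1], [1, 1]]

A = affine_matrix(np.pi / 2, 2.0, -1.0)
transformed = points @ A.T  # apply the transformation to every row at once
```

Here the point (1, 0) is rotated by 90 degrees to (0, 1) and then shifted to (2, 0).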

Understanding the local outlier factor (LOF) algorithm

This article explains the well-known LOF algorithm. We provide intuition for density-based outlier detection, show the problems inherent to this task and then take a look at how LOF solves these problems.
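
To make the quantities concrete, here is a compact NumPy sketch of the usual LOF definition (k-distance, reachability distance, local reachability density). It is a simplification under stated assumptions: every point gets exactly k neighbours, so ties at the k-distance are broken arbitrarily, and duplicate points are not handled. The function name and the test data are illustrative.

```python
import numpy as np


def local_outlier_factor(points: np.ndarray, k: int) -> np.ndarray:
    """Return one LOF score per row of `points` (values near 1 mean inlier)."""
    # Pairwise Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour

    neighbours = np.argsort(dist, axis=1)[:, :k]                    # indices, shape (n, k)
    neighbour_dist = np.take_along_axis(dist, neighbours, axis=1)   # sorted ascending
    k_distance = neighbour_dist[:, -1]                              # distance to k-th neighbour

    # reach-dist(p, o) = max(k-distance(o), d(p, o)) for each neighbour o of p.
    reach = np.maximum(k_distance[neighbours], neighbour_dist)
    lrd = 1.0 / reach.mean(axis=1)                                  # local reachability density

    # LOF(p) = mean lrd of the neighbours divided by lrd of p.
    return lrd[neighbours].mean(axis=1) / lrd


# A regular 5x5 grid plus one far-away point (index 25).
grid = [(x, y) for x in range(5) for y in range(5)]
data = np.array(grid + [(10, 10)], dtype=float)
scores = local_outlier_factor(data, k=3)
```

Because the score compares a point's density with that of its neighbours, points inside the grid score close to 1 while the isolated point scores far above 1.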

Motivation

Outlier detection is important for many real-world applications - typical examples include fraud detection, network intrusion detection or …


The repertoire method and the radix-based solution to the Josephus problem

This post summarizes how one uses the repertoire method (as presented in Concrete Mathematics by Graham, Knuth and Patashnik). First we look at the repertoire method without the need for a radix-based solution and afterwards we discuss the solution given in the book for Exercise 16.
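
As a small illustration of the radix-based idea, the sketch below handles the basic generalized recurrence from Concrete Mathematics, f(j) = alpha_j for 1 <= j < d and f(dn + j) = c f(n) + beta_j, whose solution reads the base-d digits of n and reinterprets them in radix c. It does not cover the additional term that appears in Exercise 16, and the function names are assumptions.

```python
def radix_solution(n, d, c, alpha, beta):
    """Closed form: f((b_m ... b_0)_d) = (alpha_{b_m} beta_{b_{m-1}} ... beta_{b_0})_c."""
    digits = []
    while n > 0:
        n, remainder = divmod(n, d)
        digits.append(remainder)
    digits.reverse()  # most significant digit first

    value = alpha[digits[0]]
    for b in digits[1:]:
        value = c * value + beta[b]
    return value


def recurrence(n, d, c, alpha, beta):
    """Direct evaluation of the recurrence, used only as a cross-check."""
    if n < d:
        return alpha[n]
    quotient, j = divmod(n, d)
    return c * recurrence(quotient, d, c, alpha, beta) + beta[j]


def josephus(n):
    """Josephus numbers: J(1) = 1, J(2n) = 2J(n) - 1, J(2n + 1) = 2J(n) + 1."""
    return radix_solution(n, d=2, c=2, alpha={1: 1}, beta={0: -1, 1: 1})
```

For example, 100 is 1100100 in binary, so the digits become (1, 1, -1, -1, 1, -1, -1) in radix 2, which gives J(100) = 73.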

General method

Suppose we …


A hint of Eclipse in Visual Code

Professionally I am a heavy Eclipse user. I heard that IntelliJ is better by now, but I have yet to make the switch. However, the focus of Eclipse clearly is Java programming. For my own projects I tend to use Python - and while vim is awesome I prefer the support …


Nice LaTeX plots with matplotlib

For my thesis I want the plots from Jupyter notebooks to integrate well with the rest of the LaTeX document. This article shows the settings necessary to achieve this look consistently by modifying the matplotlibrc. Using this approach the correct settings are always used and I can't forget them.
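
As a rough sketch of what such settings could look like, the snippet below derives a figure size from the LaTeX text width and renders a few rc entries in matplotlibrc syntax. The text width of 345pt, the font size, the selection of keys, and the helper names are assumptions for illustration, not the settings from the article.

```python
import math


def figsize_from_textwidth(textwidth_pt, fraction=1.0):
    """Convert a LaTeX width in TeX points to (width, height) in inches."""
    inches_per_pt = 1.0 / 72.27                   # one inch is 72.27 TeX points
    golden_ratio = (math.sqrt(5.0) - 1.0) / 2.0   # pleasant height-to-width ratio
    width = textwidth_pt * fraction * inches_per_pt
    return width, width * golden_ratio


def latex_rc(textwidth_pt=345.0, font_size=10):
    """Collect rc settings that match fonts and sizes to the LaTeX document."""
    width, height = figsize_from_textwidth(textwidth_pt)
    return {
        "text.usetex": True,        # let LaTeX render all text
        "font.family": "serif",
        "font.size": font_size,
        "axes.labelsize": font_size,
        "legend.fontsize": font_size,
        "figure.figsize": [round(width, 3), round(height, 3)],
        "savefig.bbox": "tight",
    }


def to_matplotlibrc(settings):
    """Render the settings as `key: value` lines for a matplotlibrc file."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


rc_text = to_matplotlibrc(latex_rc())
```

The same dictionary could instead be passed to `matplotlib.rcParams.update(...)` inside a notebook, but putting it in the matplotlibrc applies it to every session automatically.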


Data Mining Cup 2018

As part of a practical course from IPD we took part in this year's Data Mining Cup (DMC) sponsored by prudsys.

The DMC is a yearly competition where teams from universities around the world try to solve a data mining task. To quote the task from the official website:

The …


Redis Cluster as the basis for a distributed chat

How do you write a fault-tolerant chat for a university project? This post details a possible implementation of a distributed chat system based on Redis with a Java client. The load balancer remains a single point of failure - one possible solution is Elastic IP addresses.


Getting started with Vagrant and Ansible

A short summary of the steps necessary to install Vagrant on a Linux box and to start the first virtual machine with a precise32 base image.


Speed up Vagrant execution with a ramdisk

Since I am using Vagrant for a university project, I tried putting the virtual machine onto a ramdisk for faster startup times. The post shows that loading the machine from a ramdisk does improve the startup time. However, the machine itself does not run faster.

© Florian Kalinke. Built using Pelican. Theme by Giulio Fidente on github.