François Konschelle
Past achievements
- TokenSpan class, for cutting a string into spans, built around the notion of tokenization in natural language processing. See also the higher-level class iamTokenizing.
- SubstitutionString class, replacing a string with another one without destroying information; useful for cleaning and normalizing natural language strings, version-control systems, and filtering and de-noising sequences.
- BagOfWords class, the usual bag-of-words representation of a text, extended with many information-theory and graph-theory tools.
- Image processing, basic tools for convolution and morphology.
- Fractional time series, basic tools to generate fractional time series, either via direct calculation or by fractional differentiation.
- Mixture of von Mises distributions, an Expectation-Maximization (EM) algorithm to fit a mixture of periodic statistical distributions.
- Financial forecasting using fundamentals studies in machine learning. Private project.
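As an illustration of the fractional differentiation mentioned in the list above, here is a minimal sketch in NumPy. It is not the code of the project itself: the function names frac_diff_weights and frac_diff are hypothetical, and the approach shown (truncating the binomial expansion of (1 - B)^d to the length of the series, with B the lag operator) is only one common way to do it.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients of the expansion of (1 - B)**d (hypothetical helper)."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        # Recurrence between consecutive binomial coefficients
        w[k] = -w[k - 1] * (d - k + 1) / k
    return w

def frac_diff(x, d):
    """Fractionally differentiate x at order d; a negative d integrates instead."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, len(x))
    # Causal convolution: y[t] = sum_k w[k] * x[t - k], truncated at the series start
    return np.convolve(x, w)[: len(x)]

# Usage: fractionally integrating white noise gives a long-memory series
rng = np.random.default_rng(0)
noise = rng.standard_normal(500)
series = frac_diff(noise, -0.3)
```

With d = 1 the weights reduce to (1, -1, 0, ...), so the function falls back to an ordinary first difference; differentiating at order d and then at order -d recovers the input series.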
Skills
- Natural language processing, Information retrieval, Web of Knowledge:
  - word embeddings, from bag-of-words to word2vec and more recent models
  - de-noising, cleaning, normalizing, etc. of strings
  - automatic annotators
- Time series:
  - time-frequency analysis and modelling
  - financial forecasting
  - fractional auto-regressive models
- Python packages:
  - NumPy, SciPy
  - pandas
  - scikit-learn, TensorFlow
  - nltk, spaCy
  - BeautifulSoup
  - backtrader
- C and C++: basics of both programming languages.
- Databases and related tools:
  - SQL
  - NoSQL
  - Hadoop
  - Apache Avro
Extra resources