VHamML
A machine learning (ML) library for classification using a nearest neighbor algorithm based on Hamming distances.
You can incorporate the VHamML functions (src/vhamml.v) into your own code.
You can use VHamML with the files in the datasets directory.
Classification accuracy with datasets in the datasets directory
What, another AI package?
Is that necessary?
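To make the idea of nearest neighbor classification by Hamming distance concrete, here is a minimal sketch in Python. It is not VHamML's implementation or API: the function names `hamming_distance` and `classify` and the toy fruit data are assumptions made up for this illustration. The Hamming distance between two instances is simply the number of attributes on which they differ, and a new instance receives the class of the closest training instance.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def classify(training, query):
    """Return the class of the training instance nearest to query.

    training is a list of (attributes, class_label) pairs.
    Ties are resolved in favour of the earliest training instance.
    """
    best_label, best_dist = None, None
    for attributes, label in training:
        d = hamming_distance(attributes, query)
        if best_dist is None or d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Hypothetical toy data, invented for this example only.
training = [
    (("red", "round", "small"), "cherry"),
    (("red", "round", "large"), "apple"),
    (("yellow", "long", "large"), "banana"),
]

print(classify(training, ("yellow", "long", "small")))  # prints "banana"
```

The query differs from the banana instance on one attribute, from the cherry on two, and from the apple on three, so it is classified as a banana.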
Installation:
First, install V, if not already installed. On macOS, Linux, etc., you need git.
In a terminal:
git clone https://github.com/vlang/v
cd v
make
sudo ./v symlink # add v to your PATH
Clone this GitHub repository:
cd ~ # go back to your home directory
git clone https://github.com/holder66/vhamml
Install the needed dependencies:
v install vsl
v install Mewzax.chalk
Go into the vhamml directory, compile the app, and try it out:
cd vhamml
v . # compiles all the files in the folder
./vhamml --help # displays help information about the various commands
# and options available. More specific help information
# is available for each command.
That's it!
Tutorial:
v run . examples go
Updating:
v up # installs the latest release of V
git pull # When you're in the vhamml directory, this command pulls in the
# latest version of vhamml
v update # get the latest version of the libraries, including holder66.vhamml
v . # recompile
Getting help:
The V lang community meets on Discord
For bug reports, feature requests, etc., please raise an issue on GitHub
Speed things up:
Using the -c (--concurrent) flag makes use of available CPU cores and may speed things up. A much larger speedup comes from compiling with the -prod (production) option: the compilation itself takes longer, but the resulting code is highly optimized.
v -prod .
And then run it, e.g.:
./vhamml explore -s -c datasets/iris.tab
Examples showing use of the Command Line Interface
Please see examples_of_command_line_usage.md
Example: typical use case, a clinical risk calculator
Health care professionals frequently make use of calculators to inform clinical decision-making. For a sample of patients, data are collected on symptoms, findings on physical examination, laboratory and imaging results, and outcomes such as diagnosis, risk of developing a condition, or response to specific treatments. These data then form the basis of a formula that predicts the outcome of interest for a new patient, based on how that patient's symptoms and findings compare to those in the dataset.
Please see clinical_calculator_example.md
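As a rough illustration of predicting an outcome from how a new patient compares to those in the dataset, the sketch below converts continuous measurements into bins and then picks the outcome of the stored case with the smallest Hamming distance. Everything here is an assumption for illustration: the attribute names, cut points, cases, and outcome labels are invented and are not clinical guidance, and the binning step is one plausible way to make measurements comparable, not a description of how VHamML does it.

```python
def bin_value(value, cut_points):
    """Return the index of the bin holding value (cut points in ascending order)."""
    return sum(value >= c for c in cut_points)


def encode(patient, cuts):
    """Turn raw measurements into a tuple of bin indices, one per attribute."""
    return tuple(bin_value(patient[name], cuts[name]) for name in sorted(cuts))


def hamming(a, b):
    """Number of positions at which two encoded patients differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(cases, new_patient, cuts):
    """Return the outcome of the stored case nearest to new_patient."""
    target = encode(new_patient, cuts)
    nearest = min(cases, key=lambda case: hamming(encode(case[0], cuts), target))
    return nearest[1]


# Invented cut points and cases: illustrative only, not real clinical data.
CUTS = {"age": [40, 60], "systolic_bp": [120, 140], "cholesterol": [5.2, 6.2]}
CASES = [
    ({"age": 35, "systolic_bp": 115, "cholesterol": 4.8}, "low risk"),
    ({"age": 52, "systolic_bp": 135, "cholesterol": 5.9}, "moderate risk"),
    ({"age": 68, "systolic_bp": 150, "cholesterol": 6.8}, "high risk"),
]

new_patient = {"age": 71, "systolic_bp": 145, "cholesterol": 5.5}
print(predict(CASES, new_patient, CUTS))  # prints "high risk"
```

The new patient falls into the same age and blood pressure bins as the third case and differs only on cholesterol, so that case is the nearest and its outcome is returned.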
Example: finding useful information embedded in noise
Please see a worked example here: noisy_data.md
MNIST dataset
The mnist_train.tab file is too large to keep in the repository. If you wish to experiment with it, you can download it by right-clicking on this link, or from the command line:
wget http://henry.olders.ca/datasets/mnist_train.tab
The process of development in its early stages is described in this essay
Copyright (c) 2017, 2023: Henry Olders.