A machine learning (ML) library for classification using a nearest neighbor algorithm based on Hamming distances.
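To make the idea concrete, here is a minimal Python sketch of nearest-neighbor classification with Hamming distances. It is an illustration only, not vhamml's implementation or API: the function names `hamming_distance` and `classify` and the toy fruit data are assumptions made for this example.

```python
from typing import Hashable, List, Sequence, Tuple

Instance = Sequence[Hashable]


def hamming_distance(a: Instance, b: Instance) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("instances must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def classify(training: List[Tuple[Instance, str]], query: Instance) -> str:
    """Return the label of the training instance closest to the query.

    Hypothetical helper for illustration; ties go to the earliest instance.
    """
    best_label, best_distance = None, None
    for attributes, label in training:
        distance = hamming_distance(attributes, query)
        if best_distance is None or distance < best_distance:
            best_label, best_distance = label, distance
    if best_label is None:
        raise ValueError("training set is empty")
    return best_label


# Made-up toy data: (colour, shape, size) -> class
training = [
    (("red", "round", "small"), "cherry"),
    (("red", "round", "large"), "apple"),
    (("yellow", "long", "large"), "banana"),
]

print(classify(training, ("yellow", "long", "small")))  # banana
```

The query differs from the banana instance in one attribute, from the cherry in two, and from the apple in three, so the nearest neighbor is the banana.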
You can incorporate the
You can use
Classification accuracy with datasets in the
First, install V, if not already installed. On MacOS, Linux etc. you need
In a terminal:
git clone https://github.com/vlang/v
cd v
make
sudo ./v symlink    # add v to your PATH
Clone this GitHub repository:
cd ~    # go back to your home directory
git clone https://github.com/holder66/vhamml
Install the needed dependencies:
v install vsl
v install Mewzax.chalk
Go into the vhamml directory, compile the app, and try it out:
cd vhamml
v .                # compiles all the files in the folder
./vhamml --help    # displays help information about the various commands
                   # and options available. More specific help information
                   # is available for each command.
v run . examples go
v up        # installs the latest release of V
git pull    # When you're in the vhamml directory, this command pulls in the
            # latest version of vhamml
v update    # get the latest version of the libraries, including holder66.vhamml
v .         # recompile
The V lang community meets on Discord
For bug reports, feature requests, etc., please raise an issue on GitHub
Speed things up:
Using the -c (--concurrent) flag makes use of available CPU cores, which may speed things up. A much larger speedup comes from compiling with the -prod (production) option. The compilation itself takes longer, but the resulting code is highly optimized.
v -prod .
And then run it, e.g.:
./vhamml explore -s -c datasets/iris.tab
Examples showing use of the Command Line Interface
Please see examples_of_command_line_usage.md
Example: typical use case, a clinical risk calculator
Health care professionals frequently use calculators to inform clinical decision-making. Data on symptoms, physical examination findings, laboratory and imaging results, and outcomes such as diagnosis, risk of developing a condition, or response to specific treatments are collected for a sample of patients. These data then form the basis of a formula that predicts the outcome of interest for a new patient, based on how their symptoms and findings compare to those in the dataset.
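As a rough illustration of how such a calculator could compare a new patient to those in a dataset, the Python sketch below discretizes continuous measurements into bins, measures Hamming distance between the encoded records, and takes a majority vote among the closest ones. Everything here is an assumption made for the example: the field names (`age`, `systolic_bp`, `smoker`, `outcome`), the bin ranges, the choice of k = 3, and the patient records, which are invented toy values and not real clinical data. It does not show how vhamml itself encodes attributes or weighs neighbors.

```python
from collections import Counter


def to_bin(value, low, high, n_bins):
    """Map a continuous value to an equal-width bin index in 0..n_bins-1."""
    if value <= low:
        return 0
    if value >= high:
        return n_bins - 1
    return int((value - low) / (high - low) * n_bins)


def encode(patient):
    """Turn a patient record into a tuple of discrete attribute values."""
    return (
        to_bin(patient["age"], 20, 90, 7),           # assumed age range
        to_bin(patient["systolic_bp"], 90, 190, 5),  # assumed pressure range
        patient["smoker"],
    )


def hamming(a, b):
    """Number of attribute positions at which two encoded records differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(records, new_patient, k=3):
    """Majority outcome among the k records closest to the new patient."""
    query = encode(new_patient)
    ranked = sorted(records, key=lambda r: hamming(encode(r), query))
    votes = Counter(r["outcome"] for r in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy records, for illustration only.
records = [
    {"age": 34, "systolic_bp": 118, "smoker": False, "outcome": "low"},
    {"age": 41, "systolic_bp": 122, "smoker": False, "outcome": "low"},
    {"age": 67, "systolic_bp": 165, "smoker": True, "outcome": "high"},
    {"age": 72, "systolic_bp": 158, "smoker": True, "outcome": "high"},
    {"age": 58, "systolic_bp": 149, "smoker": True, "outcome": "high"},
    {"age": 29, "systolic_bp": 110, "smoker": True, "outcome": "low"},
]

new_patient = {"age": 69, "systolic_bp": 160, "smoker": True}
print(predict(records, new_patient))  # high
```

The new patient encodes to the same bins as the third record and differs from the fourth and fifth in one and two attributes respectively, so the three nearest records all carry the "high" outcome.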
Example: finding useful information embedded in noise
Please see a worked example here: noisy_data.md
The mnist_train.tab file is too large to keep in the repository. If you wish to experiment with it, it can be downloaded by right-clicking on
The process of development in its early stages is described in
Copyright (c) 2017, 2023: Henry Olders.