CSCE 624 - Sketch Recognition

Sunday, October 2, 2016

Homework 2 - Segmentation

Overview
Following on our previous assignment about feature extraction and classification, we will now move into additional sketch-specific operations. Over the next two assignments, we will cover stroke segmentation (corner finding) and a couple of simple recognition algorithms. Homework 2 will focus on segmentation.

Instructions
Languages and Data
While there is no "required" language for this assignment, it is strongly recommended that you use Javascript. The primary reason is ready access to the data. As we move from having a couple hundred files of small, single-stroke sketches into more complicated tasks, we will be dealing with thousands of sketches, with multiple strokes and shapes that are of substantial size. Rather than downloading large datasets to your local computer, all data for this homework is provided via the Sketch Recognition Library database API (srlib_db), which was an optional data source in the first homework.

The data handlers for srlib_db are available in Javascript via srlib_js, which is included in a limited form in the Starter code available on the class drive under "Homework 2". The documentation for srlib_js is also included; it is a JSDoc directory and so should be downloaded and read through a web browser.

The specification of SketchML::JSON, which is the format of all sketches on srlib_db, is also available in the homework's folder. If you wish to use another language, that will be a useful reference, and any language that handles JSON-like objects natively will be easy to adapt. In that case, retrieve the data with an HTTP GET request to http://srl-prod1.cs.tamu.edu:7750/getSketches?domain=MechanixCleaned&interpretation=force. That particular request simply grabs the first 50 sketches, whereas the Starter code grabs 200 between 4000 and 4200. The exact numbers do not matter; the main goal is to collect a pool of data to use with your segmentation algorithm.
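
For anyone working outside the Starter code, the request above can be sketched as follows. The endpoint and the two query parameters come from the URL in this post; the helper names (buildSketchUrl, getSketches) are made up for illustration, and treating the response body as JSON is an assumption based on the SketchML::JSON format rather than something checked against the server. Only the URL is built when the snippet runs, so it works off campus; calling getSketches() needs access to the server.

```javascript
// Endpoint taken from the assignment text.
const SRLIB_BASE = "http://srl-prod1.cs.tamu.edu:7750/getSketches";

// Build the request URL from a plain object of query parameters.
// (Hypothetical helper, not part of srlib_js.)
function buildSketchUrl(params) {
  const query = new URLSearchParams(params);
  return `${SRLIB_BASE}?${query.toString()}`;
}

// Send the GET request and decode the body.
// Assumption: the server answers with JSON (SketchML::JSON sketches).
async function getSketches(params) {
  const response = await fetch(buildSketchUrl(params));
  if (!response.ok) {
    throw new Error(`srlib_db request failed with status ${response.status}`);
  }
  return response.json();
}

const url = buildSketchUrl({ domain: "MechanixCleaned", interpretation: "force" });
console.log(url);
```

The parameters that select a range of sketches (such as the 4000 to 4200 range used by the Starter code) are not shown here; check the srlib documentation for their names.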

Segmentation

The goal of the assignment is that, given a set of raw strokes, you can segment them successfully into a set of segmented substrokes. You may implement any corner finding / segmentation algorithm you would like. This can be an algorithm from class, a paper, or one of your own creation. A few points will be awarded based on the creativity of your algorithm.
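
As one possible starting point, here is a minimal sketch of the bottom-up pass of a ShortStraw-style corner finder (see Reading 8). It is a simplification, not the full algorithm from the paper: the top-down pass that adds missed corners and removes false ones is left out. It also assumes a stroke is a plain array of {x, y} objects, which is not necessarily how srlib_js exposes points. The function names are illustrative.

```javascript
const dist = (a, b) => Math.hypot(b.x - a.x, b.y - a.y);

// Resample the stroke so that consecutive points are `spacing` apart
// along the path.
function resample(points, spacing) {
  const out = [points[0]];
  let carried = 0; // path length travelled since the last emitted point
  let prev = points[0];
  for (let i = 1; i < points.length; i++) {
    const cur = points[i];
    let d = dist(prev, cur);
    while (d > 0 && carried + d >= spacing) {
      const r = (spacing - carried) / d;
      const q = { x: prev.x + r * (cur.x - prev.x), y: prev.y + r * (cur.y - prev.y) };
      out.push(q);
      prev = q;
      d = dist(prev, cur);
      carried = 0;
    }
    carried += d;
    prev = cur;
  }
  return out;
}

// Bottom-up pass only: a "straw" is the distance between the points
// `window` steps before and after point i. Short straws suggest corners.
function findCorners(points, window = 3) {
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const diagonal = Math.hypot(
    Math.max(...xs) - Math.min(...xs),
    Math.max(...ys) - Math.min(...ys)
  );
  const pts = resample(points, diagonal / 40);

  const straws = new Array(pts.length).fill(Infinity);
  for (let i = window; i < pts.length - window; i++) {
    straws[i] = dist(pts[i - window], pts[i + window]);
  }
  const finite = straws.filter(Number.isFinite).sort((a, b) => a - b);
  if (finite.length === 0) return { points: pts, corners: [0, pts.length - 1] };
  const threshold = 0.95 * finite[Math.floor(finite.length / 2)];

  // Within each run of straws below the threshold, keep the shortest one.
  const corners = [0];
  for (let i = window; i < pts.length - window; i++) {
    if (straws[i] < threshold) {
      let best = i;
      while (i < pts.length - window && straws[i] < threshold) {
        if (straws[i] < straws[best]) best = i;
        i++;
      }
      corners.push(best);
    }
  }
  corners.push(pts.length - 1);
  return { points: pts, corners };
}

// An L-shaped stroke with a single corner at (10, 0).
const result = findCorners([{ x: 0, y: 0 }, { x: 10, y: 0 }, { x: 10, y: 10 }]);
console.log(result.corners.map((i) => result.points[i]));
```

The corner indices refer to the resampled points, not the original ones, so mapping them back to the raw stroke is a separate step.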

Try to make an algorithm which performs fairly well on the data set. All of the sketches in srlib_db for this domain (MechanixCleaned) will feature a set of raw strokes and segmented substrokes. These are provided for comparison of your algorithm. There are thousands of sketches, so you will not be able to make use of the provided substrokes to hardcode a solution. Furthermore, some of the sketches are imperfect, with the occasional dropped stroke and mis-identified substroke. They were generated by a sketch recognition software tool -- Mechanix in this case. As such, you should try to make the best segmentation algorithm you can, knowing that some disagreements will exist between your algorithm and the original dataset.

Finally, Mechanix is designed to recognize trusses, so it segments strokes into substrokes at points of nodes on trusses. You may choose to ignore this form of segmentation for the assignment if you wish to focus on corner finding. It will not impact you greatly on this homework; however, as the subsequent homework will be using the same data to implement a recognizer for arrows and trusses, you may benefit from some additional work in finding truss nodes as corners in this homework.

More details will be shared regarding evaluation metrics soon. For now, the focus should be on an algorithm which generates substrokes and adds them to a sketch object when only working from the raw stroke data.
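
Once you have corner positions, turning them into substrokes is mostly bookkeeping. The sketch below assumes a stroke is a plain array of points and that the corners are given as sorted indices that include both endpoints. How substrokes are actually attached to a SketchML::JSON sketch object is not shown; refer to the specification and the srlib_js documentation for that. The function name is illustrative.

```javascript
// Split one stroke into substrokes at the given corner indices.
// Assumption: cornerIndices is sorted and includes 0 and points.length - 1.
// Each corner point is shared by the two substrokes that meet there.
function splitAtCorners(points, cornerIndices) {
  const substrokes = [];
  for (let k = 0; k + 1 < cornerIndices.length; k++) {
    substrokes.push(points.slice(cornerIndices[k], cornerIndices[k + 1] + 1));
  }
  return substrokes;
}

const stroke = [
  { x: 0, y: 0 }, { x: 5, y: 0 }, { x: 10, y: 0 },
  { x: 10, y: 5 }, { x: 10, y: 10 },
];
const substrokes = splitAtCorners(stroke, [0, 2, 4]);
console.log(substrokes.length, substrokes.map((s) => s.length));
```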

Starter Code

As with Homework 1, Starter code is provided in the directory for Homework 2. This file already handles loading data and converting it into the appropriate format. It also visualizes the original sketch data, prefixed "old", and your sketch data, prefixed "new", according to either the strokes or substrokes. Note that Mechanix sketches may be large and can get cut off on the canvas.

Obtaining Credit

Again, the goal of this assignment is that you write a corner finding / segmentation algorithm. You will need to submit all of your source files to the grader. Also, include a small report describing what algorithm you implemented, the intuition of how it works, and the most difficult problems you encountered. If you run the metrics code, include those numbers as well.

Please ZIP all your files together into a single file submission titled "HW2_<last name>_<first initial>.zip". Since the university may block ZIP files, it is recommended that you upload your file to Google Drive and share the link via email.

More details will follow regarding the evaluation metrics. Remember that the Mechanix data is imperfect, which will be accounted for during the grading process; however, srlib_db has a large sketch library against which your work may be checked, so you should still strive to write an algorithm which performs well on the supplied data.

Although this assignment seems large because of the size of the data repository, it really only consists of a single algorithm, likely meaning less code than the first homework. Thus, you will have roughly a week and a half to complete the assignment.

Due Date
Oct. 16, Sunday @ Midnight (extended from Oct. 12)
25% deducted per day late

Friday, September 30, 2016

Reading 8 - ShortStraw

Paper
Wolin, A., Eoff, B., & Hammond, T. (2008, June). ShortStraw: A Simple and Effective Corner Finder for Polylines. In SBM (pp. 33-40).

Notes
Continuing with corner finding, we will now look at a very simple segmentation algorithm -- ShortStraw.

Wednesday, September 28, 2016

Reading 7 - Sezgin and Stahovich

Paper
Sezgin, Tevfik Metin, Thomas Stahovich, and Randall Davis. "Sketch based interfaces: early processing for sketch understanding." ACM SIGGRAPH 2006 Courses. ACM, 2006.

Notes
Up to this point, we've been concerned with understanding and building features of strokes and using these features to describe and classify gestures. With that background on machine learning and the familiarity gained with working on sketch data, we are now moving into more sketch-specific problems. To begin with, we will look at corner detection for a few readings. One of the more important original works in corner detection also helps to communicate the intuition behind the features that are often used, so we will be starting with the Sezgin/Stahovich paper shared in the class drive.

Wednesday, September 21, 2016

Reading 6 - Geometry Overview

Paper
Hammond, T. "Fundamentals of Geometry". Draft

Notes
In the shared folder for Readings, you can access the draft version of Dr. Hammond's paper on basic geometry. This paper provides some background information on dealing with geometries, vector spaces, and matrices. Not only is the knowledge applicable to a number of classification methods, such as our current linear classification discussion, but it is also a good review of trigonometry and functions that we often deal with in feature extraction. As we move on to other methods of recognition, this intuition will be useful.

Sunday, September 18, 2016

Reading 5 - Metrics Overview

Paper
Hammond, T. "Evaluation Methods and Metrics". Draft

Notes
In the shared folder for Readings, you can access the draft version of Dr. Hammond's paper on evaluation metrics. This paper is a general overview of some common methods used in evaluation and accuracy reporting for machine learning problems. In this class, we'll be dealing with features and classifiers in lectures, homeworks, and more, so having a good understanding of the basics of evaluation will be very helpful.

Tuesday, September 13, 2016

Homework 1 - Features and Classification

Overview
This marks the first programming-based homework assignment of the course. The intention behind this assignment is to prepare you for the class project and give some practical experience using the material we've been discussing in lecture. You will need to implement feature transformations and perform some classification on those features for this homework. It has two parts:

1. Rubine Features
2. Weka Classification

Instructions
There is no required programming language for this homework. The only requirements are that you implement Rubine and use Weka. To measure your performance on these tasks, you will need to follow some data format guidelines -- use the data provided and make sure your feature extractor outputs or can be made to easily output a CSV file. These details will be discussed shortly, but keep them in mind when choosing how you wish to begin the homework.

Also, a "Homework 1" directory has been added inside the shared class Google Drive. It contains all the downloadable materials for this assignment.

1. Rubine Features

a. Data

First, you'll need the data. There are two data sets. One is the small sample set which we have seen in quizzes and in discussion. The sample CSV output is also available for this data set so that you may check your feature extractor before applying it to the second data set. The second data set is a collection of alphabet letters. There are 20 samples for each letter.

Data is available to you in three ways.

i. TXT Files

The TXT version of the data is the most rudimentary means of storing a sketch available. These files list only the points, with each line containing x, y, and t for a single point. Reading the raw data from these files should be relatively simple, but there's no inherent structure about the sketch that is saved.
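
Reading such a file might look like the sketch below. The delimiter between x, y, and t is not stated above, so this version accepts commas or whitespace; adjust it after looking at an actual file. The function name is illustrative.

```javascript
// Parse TXT sketch data: one point per line, holding x, y, and t.
// Assumption: values are separated by commas and/or whitespace.
function parseSketchTxt(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [x, y, t] = line.split(/[\s,]+/).map(Number);
      return { x, y, t };
    });
}

const parsedPoints = parseSketchTxt("0 0 0\n1,2,10\n3, 4, 20\n\n");
console.log(parsedPoints);
```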

ii. JSON files

The JSON version of the data is saved in the SketchML::JSON format, developed by the Sketch Recognition Lab and based on MIT's original SketchML format on the XML platform. This format is still fairly lightweight compared to XML, although not as compact as the TXT format. It provides structure to the data, giving a collection of points, strokes, and more to the sketch object. JSON is very easy to work with in many modern languages, but you may wish to reference the SketchML::JSON specification document if you decide to parse the JSON yourself. It is saved alongside the data in the same folder.

iii. Sketch Recognition Library API

The Sketch Recognition Library (srlib) is a collection of sketches from different domains that have been gathered into a single format (SketchML::JSON) and made available through a RESTful API. All the data used for this assignment may be accessed from the srlib API, but it is not available on the Google Drive. It is live at http://srl-prod1.cs.tamu.edu:7750/. You must be on campus or on the VPN to access it. That link will direct you to the documentation. All you need to get started is that the "sample" data set is available with the domain "rubine" and the "letters" data set is available with the domain "letters". An example is included in the starter code discussed below.

b. Feature Extraction

Regardless of how you choose to get the data, you must implement all 13 of Rubine's features. The sample data set is provided to assist you in debugging, while the larger data set of letters will be used in the next stage with Weka.
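
To give a feel for the shape of a feature extractor, here is a sketch of six of the thirteen features (F1, F2, F3, F5, F8, and F13), following the definitions in Rubine's paper as I understand them. It assumes a single stroke given as an array of {x, y, t} points with at least three points, the third of which differs from the first. The remaining seven features are left for you, and the function name is illustrative. Check your values against "sample.csv" rather than trusting this snippet.

```javascript
// A subset of Rubine's features for one stroke.
// Assumptions: points is an array of {x, y, t}, has at least 3 entries,
// and the third point is not at the same position as the first.
function rubineSubset(points) {
  const first = points[0];
  const third = points[2];
  const last = points[points.length - 1];

  // F1, F2: cosine and sine of the initial angle (first to third point).
  const initial = Math.hypot(third.x - first.x, third.y - first.y);

  // F3: length of the bounding box diagonal.
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const width = Math.max(...xs) - Math.min(...xs);
  const height = Math.max(...ys) - Math.min(...ys);

  // F8: total stroke length, summed over consecutive point pairs.
  let length = 0;
  for (let i = 1; i < points.length; i++) {
    length += Math.hypot(points[i].x - points[i - 1].x, points[i].y - points[i - 1].y);
  }

  return {
    f1: (third.x - first.x) / initial,
    f2: (third.y - first.y) / initial,
    f3: Math.hypot(width, height),
    f5: Math.hypot(last.x - first.x, last.y - first.y), // first-to-last distance
    f8: length,
    f13: last.t - first.t, // duration
  };
}

const features = rubineSubset([
  { x: 0, y: 0, t: 0 },
  { x: 1, y: 1, t: 10 },
  { x: 3, y: 4, t: 20 },
  { x: 6, y: 8, t: 30 },
]);
console.log(features);
```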

To provide further assistance with this step, starter code has been included in the shared folder. The starter code does not implement any of the feature transforms. It is actually intended to be a viewer for you as another means of debugging. To use it as a viewer, download all the files in the starter folder and open "index.html" in your browser.

Because the starter code uses the srlib database and includes a special build of the srlib Javascript toolkit (data management functions only), it is essentially the same code you would write to access the data from srlib in Javascript on your own. For that reason, I added a single line: a callback to an empty "getFeatures()" function, where you may implement your feature extractor if you wish to work with the starter code. Again, you may use any language you want, so there's no requirement to use the starter code. Even if you do wish to work in Javascript, you may want to look at the data in its raw TXT or JSON form to see how it looks. But because this homework is concerned with feature extraction and classification, not data handling and plotting, the viewer and Javascript tools are available as a resource.

c. Output

The language doesn't matter because what matters is the output. Your program should generate a CSV file, or log console output which can be readily saved as CSV, where each row represents a sketch. Thus, for the sample data set, you will have 8 rows of features. The letters data set will have 20*26 rows.

Each row should contain the sketch's class/interpretation followed by the 13 Rubine feature values in order of F1 through F13. For the sample data set, the class can just be the name or ID; it doesn't really matter. For the letters data set, you should save the letter as the class. The letter is saved in the SketchML::JSON data under the top-level shape's "interpretation" field.
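
A row in that layout can be produced along these lines. This is only a sketch: the function name is illustrative, and whether the file should start with a header row is not specified here, so compare your output against "sample.csv".

```javascript
// Build one CSV row: the class label followed by F1 through F13.
// Assumption: the label contains no commas or quotes (true for letters).
function toCsvRow(label, features) {
  if (features.length !== 13) {
    throw new Error(`expected 13 features, got ${features.length}`);
  }
  return [label, ...features].join(",");
}

const row = toCsvRow("a", [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]);
console.log(row);
```

Joining the rows with newline characters and logging the result gives console output that can be saved directly as CSV.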

The file "sample.csv" provides the Rubine feature values for the sample data set to assist in your evaluation. You will need to generate "letters.csv" using your feature extractor.

2. Weka Classification

The second half of the assignment will not require any additional programming. For this part, you'll be using Weka to build some classifiers with the features you extracted in the previous part. Weka can import CSV directly, which is one of the primary reasons that your feature extractor should support saving CSV features... the other being ease of grading.

Once you have a "letters.csv" which has the class label and set of 13 features for all 520 sketches, import the data into the Weka Explorer. From Explorer mode, you can test out many different functions in Weka, including classification, dimensionality reduction, and visualization.

Try at least 5 different classifiers on the data. Think about mixing k-fold cross-validation and other data splitting methods. You should report the classifiers and settings you chose along with the results in your report to be submitted with the code and CSV files.
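
Weka performs the data splitting for you, so no code is required for this part. Purely to illustrate what k-fold cross-validation does, the sketch below partitions row indices into k folds, with each fold serving once as the test set. It assigns rows to folds round-robin, which is a simplification and not a description of how Weka builds its folds. The function name is illustrative.

```javascript
// Partition n row indices into k folds. Fold j holds the rows whose
// index is congruent to j modulo k; each fold is the test set once.
function kFoldSplits(n, k) {
  const splits = [];
  for (let fold = 0; fold < k; fold++) {
    const train = [];
    const test = [];
    for (let i = 0; i < n; i++) {
      (i % k === fold ? test : train).push(i);
    }
    splits.push({ train, test });
  }
  return splits;
}

// 520 letter sketches, 10 folds: every row is tested exactly once.
const splits = kFoldSplits(520, 10);
console.log(splits.map((s) => `${s.train.length} train / ${s.test.length} test`));
```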

I've tried to include all the materials you'll need either online or in the shared class folder.

At the end of both parts of the homework, you'll have gained some familiarity with gesture-based sketch features and classification methods through Weka. Later, we'll be investigating the programming-based recognition a bit more, so you will probably have the chance to implement classifiers, segmenters, and other sketch algorithms in upcoming assignments.

Obtaining Credit

You will need to email the grader all of your files.

For simplicity, place all your files inside a single folder named "HW1_<last name>_<first name>" that will be compressed and submitted as a ZIP. You should include the following files:

+ If you decided to download a data set (JSON or TXT), include these files in a "data" directory.

+ Include the source code which reads the data and generates the features in a "source" directory. This can be a single file or multiple files, even a combination of scripts. If your source is more than a single file, please add a small README that tells which one to run.

+ Include your CSV files for both the sample and letters data sets in a "results" directory.

+ Include the output of the Weka console after testing each of your 5 different classifiers and configurations in the same "results" directory with the CSV files. This can just be a TXT file that you make by copying and pasting from the Weka window.

+ Include your report which (briefly) discusses the options you tried out in Weka and reports your findings in the top level. You should have at least a small discussion of your findings. Were you pleased with the results? Did the exercise help you think of different features which you feel would be more important? Any general impressions you had during either part of the assignment could be excellent material for a discussion. (e.g. "First, I tried a Random Forest with 10-fold cross validation and obtained an F-measure of 86.4%. Next, I wanted to see how a neural network would perform, so I ran a multilayer perceptron with...")

To recap, a submission might look like this:

HW1_Polsley_Seth/
|-- data/
|---- sample-json/ (contains all the sample json files)
|---- letters-json/ (contains all the letter json files)
|-- source/
|---- rubine.py
|-- results/
|---- sample.csv
|---- letters.csv
|---- wekalog.txt
|-- report.txt

This layout isn't an exact science. I mainly want to make sure you include everything to demonstrate that you completed each part, and the structure is intended to make it easier for you to show that. If you used the Sketch Recognition Library starter code, you can have many fewer files. If you wish to consolidate the Weka log into your report, that is also fine. So another submission may be:

HW1_Polsley_Seth/
|-- index.html
|-- require.js
|-- srlib.js
|-- sample.csv
|-- letters.csv
|-- report.pdf

This is also fine. I mostly just need your code, the code output (CSV files), the Weka output (Weka log), and a short write-up (discussion/report). Include all of these in some obvious manner, and it's ok.

Again, please ZIP all your files together into a single file submission titled "HW1_<last name>_<first initial>.zip"

As always, contact me if you have any questions.

Due Date
Sep. 26, Monday @ Midnight
25% deducted per day late

Monday, September 12, 2016

Reading 4 - Gesture Overview

Paper
Hammond, T. "Introduction to Gesture Recognition". Draft

Notes
In the shared folder for Readings, you can access the draft version of Dr. Hammond's paper on gesture recognition. It is an overview of all the previous gesture work we've discussed and provides many practical explanations and examples to help cement these concepts before we move on into classification techniques and other types of recognizers.