Be patient. The tests for project 3 can take some time to complete, even if you have an efficient approach.
Project 3 Multithreading
For this project, you will extend your previous project to support multithreading. In addition to meeting the previous project requirements, your code must make a thread-safe inverted index, and use a work queue to build and search an inverted index using multiple threads.
Functionality
The core functionality of your project must satisfy the following requirements:
-
Maintain the functionality of the previous project.
-
Process additional command-line parameters to whether to use multithreading and if so, how many threads to use in the work queue. See the Input section for specifics.
-
Support the same output capability as before. See the Output section for specifics.
-
Create a custom read/write lock class that allows multiple concurrent read operations, but non-concurrent write and read/write operations.
-
Create a thread-safe inverted index using the custom read/write lock class above.
-
Use a work queue to build your inverted index from a directory of files using multiple worker threads. Each worker thread should parse a single file.
-
Use a work queue to search your inverted index from a file of multiple word queries (for both exact and partial search). Each worker thread should handle an individual query (which may contain multiple words).
-
Exit gracefully without calling
System.exit()
when all of the building and searching operations are complete. -
Either extend your classes from previous projects or create completely new classes to support multithreading. DO NOT REMOVE YOUR SINGLE THREADING CODE, as you still need to support single threaded building and searching the index.
-
You may NOT use any of the classes in the
java.util.concurrent
package.
The functionality of your project will be evaluated with various JUnit tests. Please see the Testing section for specifics.
Input
Your main method must be placed in a class named Driver
. The Driver
class should accept the following additional command-line arguments:
-
-threads num
threads where-threads
indicates the next argumentnum
is the number of threads to use. Ifnum
is missing or an invalid number of threads are provided, please default to 5 threads.If the
-threads
flag is not provided, then assume your project should be single-threaded and should execute exactly as previous projects.
The command-line flag/value pairs may be provided in any order, and the order provided is not the same as the order you should perform the operations (i.e. always build the index before performing search, even if the flags are provided in the other order).
Your code should support all of the command-line arguments from the previous project as well.
Output
The output of your inverted index and search results should be the same from the previous project. As before, you should only generate output files if the necessary flags are provided.
Testing
Unlike previous projects, you do not have to pass 100% of the tests in the Project3Test.java
group of JUnit tests to satisfy the project functionality requirements. Instead, you only have to pass the ThreadTest.java
group of JUnit tests to earn your functionality grade and sign up for your first project 3 code review.
However, to pass code review, you must pass 100% of the tests in the Project3Test.java
group of JUnit tests and your project 3 code must also be faster than your project 2 code. You can test this on a lab computer using the benchmark
command as follows:
/home/public/cs212/benchmark gituser r2 r3
… where gituser
specifies your Github username, r2
specifies the release version of project 2 to test, and r3
specifies the release version of project 3 to test. For example:
/home/public/cs212/benchmark sjengle v2.2.1 v3.1.7
This script will automatically download, compile, and compare your project v2.2.1
and project v3.1.7
releases for the sjengle
username on Github.
This script takes awhile to run. There is no point in running the benchmark
script until you are passing the StressTest.java
group of JUnit tests.
Hints
It is important to develop the project iteratively. One possible breakdown of tasks are:
-
Get
log4j2
working and start adding debug messages in your code. Once you are certain a class is working, disable debug messages for that class in yourlog4j2.xml
file. -
Consider extending your previous inverted index to create a thread-safe version. You may need to add additional functionality to your single-threaded versions.
-
Create a thread-safe inverted index using the
synchronized
keyword. Do not worry about efficiency or using a custom lock class yet. -
Modify how you build your index (both from a directory and from the web) to use multithreading and a work queue. Make sure you still pass the unit tests.
-
Modify how you search your index to use multithreading and a work queue. Make sure you still pass the unit tests.
-
Once you are sure this version works, convert your inverted index to use a custom read/write lock class. Make sure you still pass the unit tests.
-
Start worrying about efficiency. Make sure you are not under or over synchronizing, and that your multithreaded code is faster on average than your single-threaded code.Do not worry about efficiency until after your first code review. -
Test your code in a multitude of settings and with different numbers of threads. Some issues will only come up occasionally, or only on certain systems.
-
Test your code with logging enabled, and then test your code with logging completely disabled. Your code will run faster without logging, which sometimes causes some concurrency problems.
-
Lastly, do not start on this project until you understand the multithreading code shown in class. If you are stuck on the code shown in class, PLEASE SEEK HELP. You do not need to figure it out on your own—you can ask the CS tutors, the teacher assistant, or the instructor for help.
The important part will be to test your code as you go. The JUnit tests provided only test the entire project as a whole, not the individual parts. You are responsible for testing the individual parts themselves.