Wednesday, July 17, 2019

Automated Grading System

If bridges and buildings were made the way we make software systems, we would have disasters happening daily. I have heard this several times, from many people. It is sad, but true. Buggy software is the bane of the software industry. One way of increasing software quality is proper education. Several professionals from the software industry attest to this: they believe that a greater emphasis should be placed on quality and testing in university courses. But simply explaining the principles of software quality is not sufficient. Students tend to forget theoretical principles over time. Practical exposure and experience are equally important. Students should be put in an environment where they can appreciate the importance of quality software and can experience the benefits of processes that enhance quality. Many universities have an internship period in which students work in a software company and experience these factors first hand. But because an internship usually lasts only 3-6 months, it is not sufficient to instill the importance of quality.

Emphasis on code quality should be made a part of the entire software curriculum for it to have proper impact. Every assignment that students submit should be held to the same quality standards as an industrial project. Having university assignments adhere to industrial standards will, however, result in the faculty having to spend more time grading the assignments. The faculty can no longer simply give an assignment, wait for the students to submit it, and grade them. The faculty must act more like a project mentor who constantly guides the students and helps them improve the quality of their work.

Along with spending a good amount of time mentoring students outside class hours, another challenge is timely evaluation of student assignments. Faculty members are already overloaded with the tasks of teaching, guiding projects, grading, and research. Once we incorporate testing and quality into the curricula, each assignment will have to be graded along many more dimensions, such as the quality of the tests, the coverage of the tests, and so on. This can be very time consuming. We need a mechanism which will automatically grade student assignments to the best possible extent, so that students get timely feedback and faculty can focus on providing feedback on the style, design, and documentation of the project. Such a system will also bring uniformity to the grading process and will eliminate discrepancies due to instructor bias and lethargy.

A good automated grading system should be capable of executing the test cases written by the students as well as those provided by the faculty against the project, determining the coverage of the test cases, and compiling and executing the submitted programs. It should be configurable, so that faculty can decide the importance of the various factors that make up the final grade (a small sketch of such a weighting follows the list below).

Several efforts have been made to design and implement automated grading systems in universities. Some existing systems are:

1. WEB-CAT [1]
2. Curator [2]
3. ASSYST [3]
4. Praktomat [4]
5. PGSE [5]
6. PILOT [6]
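As promised, here is a minimal sketch in Python of such a configurable weighting. The criterion names, the weights, and the final_grade helper are all invented for illustration; none of them come from the systems listed above.

```python
# A minimal sketch of a configurable final-grade formula. The criteria and
# weights below are illustrative assumptions; the faculty would supply
# their own mapping.
DEFAULT_WEIGHTS = {
    "code_correctness": 0.40,
    "test_validity": 0.25,
    "test_coverage": 0.35,
}

def final_grade(scores: dict[str, float],
                weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Combine per-criterion scores (each on a 0-100 scale) into one grade."""
    total_weight = sum(weights.values())
    return sum(scores.get(name, 0.0) * w for name, w in weights.items()) / total_weight

# Example: strong code, weaker tests.
print(final_grade({"code_correctness": 95, "test_validity": 70, "test_coverage": 60}))
```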
In this article I will briefly explain two such automated grading systems, WEB-CAT and Praktomat, and propose a system that combines useful features from both of them along with several new features.

WEB-CAT

WEB-CAT was created at Virginia Tech to address the need for incorporating software testing as an integral part of all programming courses. Its creators realized the need for software that automatically grades student assignments, both to enable faster feedback to students and to ease the workload of faculty members. Since Test Driven Development (TDD) was to be used for all assignments, students had to be graded not only on the quality of their code, but also on the quality of their test suites.

WEB-CAT grades students on three criteria: it gives each assignment a test validity score, a test coverage score, and a code correctness score. Test validity measures the accuracy of the student's tests; it determines whether the tests are consistent with the problem statement. Test coverage determines how much of the source code the tests cover, and whether all paths and conditionals are adequately exercised. Code correctness measures the correctness of the actual code. All three criteria are given a certain weight and a final score is determined.

WEB-CAT's graphical user interface is inspired by the unit testing tool JUnit [7]. Just like JUnit, it uses a green bar to show the test results. A text description is also provided, containing details such as the number of tests that were run and the number that passed. Basic features provided by WEB-CAT are:

- Submission of student assignments using a web based wizard interface
- Submission of test cases using a web based wizard interface
- Setup of assignments by the faculty
- Download of student scores by the faculty
- Automatic grading with immediate feedback on student assignments

WEB-CAT follows a fixed sequence of steps to assess a project submission. A submission is assessed only if it compiles successfully; if compilation fails, a summary of errors is displayed to the user. If the program compiles successfully, WEB-CAT assesses the project on several parameters. It first tests the correctness of the program by running the student's tests against it. Since these tests are submitted by the students themselves, it is expected that 100% of them will pass; we do not expect students to submit a program that fails their own tests. After this, the student's test cases are validated by running them against a reference implementation of the project created by the instructor. If a student's test case fails on the reference implementation, it is deemed invalid. Finally, the coverage of the student's test cases is evaluated. Once the scores are obtained, a cumulative score out of 100 is calculated by applying a formula to the scores from all criteria. The results are displayed immediately to the student in an HTML interface.

It was observed that the quality of student assignments increased significantly after using WEB-CAT. Code developed using WEB-CAT was found to contain 45% fewer defects per 1000 (non-commented) lines of code [8].
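The test validity step is the most distinctive part of this sequence, so here is a rough sketch of the idea in Python, using pytest as a stand-in test runner. WEB-CAT itself is JUnit based; the directory layout and the import-path swap below are my own assumptions, not WEB-CAT's actual mechanism.

```python
# Sketch of the test-validity idea: run the student's tests against the
# instructor's reference implementation; any test that fails there is
# deemed invalid. Paths and the PYTHONPATH swap are illustrative only.
import os
import subprocess

def student_tests_are_valid(student_tests_dir: str, reference_impl_dir: str) -> bool:
    # Put the reference implementation on the import path so the student's
    # tests exercise it instead of the student's own code.
    env = dict(os.environ, PYTHONPATH=reference_impl_dir)
    result = subprocess.run(["pytest", student_tests_dir], env=env)
    return result.returncode == 0  # exit code 0 means every test passed
```

A real grader would parse the per-test results to award a partial validity score, rather than returning a single pass/fail verdict.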
Praktomat

Praktomat was created at Universität Passau in Germany. The purpose of creating Praktomat was to build an environment which would help students enhance the quality of their code. Along with automated grading, it also has a strong focus on code reviews. The creators of Praktomat felt that reviewing others' software, and having one's own software reviewed, helps in producing better code. This is why Praktomat has a strong focus on peer review and allows users to review as well as annotate code written by other students. Students can resubmit their code any number of times until the deadline. This way they can improve their code by adopting things they learned while reviewing other students' code, as well as lessons learned from others' feedback on their own code.

Praktomat evaluates student assignments by running them against test suites provided by the faculty. The faculty creates two test suites: a public suite and a secret suite. The public suite is distributed to the students to help them validate their project. The secret test suite is not made available to the students, but they are advised of its existence. An assignment is evaluated by automatically running both test suites against it, and also by manual examination by the faculty. Praktomat was developed in Python, and is hosted on SourceForge [9].
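Here is a minimal sketch of the two-suite idea in Python. The directory names and the pytest-based layout are assumptions made for illustration; they do not reflect Praktomat's actual structure.

```python
# Illustrative two-suite evaluation: the public suite ships with the
# assignment, while the secret suite runs only on the server at
# evaluation time. Directory names are invented, not Praktomat's layout.
import subprocess

def evaluate(submission_dir: str) -> dict[str, bool]:
    """Run both faculty suites against a submission; True means all tests passed."""
    results = {}
    for suite in ("tests/public", "tests/secret"):
        proc = subprocess.run(["pytest", suite], cwd=submission_dir)
        results[suite] = proc.returncode == 0
    return results
```

The manual part of the evaluation, of course, stays with the faculty.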
Observations

My argument that student project submissions should be backed by a process that encourages best practices, and by software that automates as well as facilitates that process, has become stronger after reviewing WEB-CAT and Praktomat. What best practices should we incorporate in the process? What features should an automated grading software contain? WEB-CAT, Praktomat, and several other systems give us a good starting point. We can learn from their successes and failures, and enhance the offering by adding our own experience.

WEB-CAT and several other sources [10] have shown us that TDD is definitely a good practice. In a university environment, TDD will work best if it is complemented by instant feedback to the students. We want a process that encourages students to improve the quality of their code; they should be graded on the best code they can submit before the deadline. Two things are needed for this: instant feedback and the ability to resubmit assignments. WEB-CAT achieves this by assessing submissions in real time and displaying the results to the students immediately, and it allows students to resubmit assignments any number of times until the due date. Since faculty members are already overloaded with work, the software should take over some of the faculty's responsibilities. WEB-CAT automatically evaluates and grades the students' assignments, leaving the faculty with time for more meaningful activities.

Praktomat has shown us that there is a definite benefit to peer review. When we review code written by others, we can go beyond the paradigms set in our own mind. Having our code reviewed by others can help us see shortcomings we may have earlier overlooked. Praktomat allows students to review code written by others; however, the reviews are concealed from the faculty, to ensure that they do not impact grading. Praktomat does not rely on 100% automatic evaluation of the assignments: certain aspects are evaluated automatically and the rest are evaluated manually. Factors like code quality, documentation, etc. are reviewed and evaluated manually by the faculty. There may be two reasons for this: software to support automatic evaluation of these factors may not have been available when Praktomat was written, or the creators felt that certain things are best evaluated by the faculty.

A proposed system for automated grading

Based on my observations from reviewing the above software systems, and on my own experience, I have defined a process and the functional expectations of a software system that supports TDD and automated grading.

The Process

- Every project should have a deadline, just like in the real world.
- The project should be defined as a set of use cases and a functional test suite. Both should be made available to the students.
- Students should develop their project using the TDD philosophy. They should also be provided a source code repository such as CVS or VSS.
- Once the students have completed their project, they should tag the build and upload the tag number to a web based submission software. It must be clearly defined how the students should submit their unit test suite; they should also provide one file which will drive the remaining unit tests.
- The software will pull the source from the repository and evaluate it (sketched just after this list):
  o Failure is reported to the student if the project fails to compile. Failure here does not mean that the student fails the assignment; assignments can be corrected and resubmitted any number of times until the deadline.
  o Once the compilation succeeds, the software runs the unit tests written by the student against their code.
  o After collecting results from the unit tests, the test coverage is measured.
  o Then the functional tests created by the faculty are executed against the software.
  o The submission is then run through a source code format checker which evaluates it for adherence to coding standards.
  o The submission is then run through a source code quality checker which evaluates the quality of the code based on known best practices and anti-patterns.
  o The submission is finally channeled to the faculty, who evaluate it for design.
  o Results from all the tests are given out of 100%.
  o After collecting all the results, a formula (provided by the faculty) is applied to derive the final score.
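Below is a condensed sketch of this pipeline in Python. The tool choices (pytest, coverage.py, pycodestyle, pylint), the directory names, and the 80% coverage threshold are all illustrative assumptions; as noted later, the choice of implementation technology is deliberately left open.

```python
# Condensed sketch of the proposed evaluation pipeline. Each step is
# reduced to a pass/fail score here; a real system would record granular
# results for each check in a database, as described above.
import subprocess

def step(cmd: list[str], cwd: str) -> bool:
    """Run one pipeline step; True means the step exited cleanly."""
    return subprocess.run(cmd, cwd=cwd).returncode == 0

def evaluate_submission(workdir: str, weights: dict[str, float]) -> float | None:
    # Compilation gate: a submission is assessed only if it builds.
    if not step(["python", "-m", "compileall", "src"], workdir):
        return None  # report the failure; the student may fix and resubmit

    scores = {
        # Student unit tests, run under coverage measurement.
        "unit_tests": 100.0 * step(["coverage", "run", "-m", "pytest", "tests/student"], workdir),
        "coverage": 100.0 * step(["coverage", "report", "--fail-under=80"], workdir),
        # Faculty functional suite, then format and quality checkers.
        "functional_tests": 100.0 * step(["pytest", "tests/faculty"], workdir),
        "format": 100.0 * step(["pycodestyle", "src"], workdir),
        "quality": 100.0 * step(["pylint", "src"], workdir),
    }
    # Faculty-provided formula; here, a simple weighted average.
    return sum(scores[k] * w for k, w in weights.items()) / sum(weights.values())
```

The faculty's manual design score would be merged in the same way before the final formula is applied.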
The Software

- The software should provide an account with a username and password to each student and faculty member.
- The software should be web based, so that it can be accessed from anywhere using a standard web browser.
- After logging in, students should be able to browse to the homepage of a particular assignment and view details such as the specification, due dates, and anything else posted by the faculty.
- When a student completes her assignment, she should be able to upload the CVS tag number to the server.
- Once the tag number is uploaded, the server should pull the source code from the CVS repository and perform the checks mentioned above. The result of each check is recorded in the database, and the detailed result is then displayed to the student.
- Students should be able to resubmit an assignment any number of times until the deadline.
- Student code should be available for peer review and annotation, if the faculty desires.
- The faculty should be able to create an assignment and upload details and files.
- The faculty should be able to trigger the final evaluation of all assignments, either manually or at a scheduled time. An evaluation should take the latest tag number provided by each student and run the tests on the respective source code. Results should be made available to the faculty and the students.
- The faculty should be able to add their own scores for the parts that were checked manually.
- The final result is computed by applying a formula provided by the faculty.
- The final results should be downloadable as a CSV text file.

Several technologies, such as Java, Python, PHP, .NET, and Ruby, can be used to implement such a system. All have their pros and cons. We will not cover the implementation technology in this paper. Evaluation of these technologies and a final choice based on the evaluation will be dealt with in a separate paper.

References

1. http://scholar.lib.vt.edu/theses/available/etd-05222003-225759/unrestricted/Web-CAT.pdf
2. http://www.cs.vt.edu/curator/PublicInfo/CuratorIntroduction.pdf
3. http://portal.acm.org/citation.cfm?id=268210
4. http://www.infosun.fmi.uni-passau.de/st/papers/iticse2000/iticse2000.pdf
5. Jones, E. L. Grading student programs - a software testing approach. J. Computing in Small Colleges, 16(2), pp. 185-192.
6. http://www-2.cs.cmu.edu/rsbaker/pilot.pdf
7. http://www.junit.org
8. Using Test Driven Development in the Classroom: Providing Students with Automatic, Concrete Feedback on Performance. http://web-cat.cs.vt.edu/grader/Edwards-EISTA03.pdf
9. http://sourceforge.net/projects/praktomat/
10. http://www.testdriven.com
