Friday, 4 April 2014

Survey Testing

While developing the collection survey for the placement program, two revisions were pilot tested, each with a sizeable group of participants (~30 for the first, ~70 for the second). This was done partly to involve students in the program's development process. The first revision (here) included a number of features that proved to be problematic.

The personality and skills sections formed the main body of the survey. Each contained six questions. For each question, participants were asked to rate themselves on a scale from 1 (least like me) to 10 (most like me), and then to rank the relative importance of the six questions from 1 to 6. The 1-to-10 numbering did not work: most participants equated 10 with good and 1 with bad, which badly skewed the results. The importance ranking, meanwhile, was rarely applied correctly. In retrospect, the mismatched scales (1 to 10 versus 1 to 6) should have been enough for me to realize that participants would struggle to understand what was expected.
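
For concreteness, here is a minimal sketch (in Python, with hypothetical field names) of the two response types a single section collected; the validation check hints at why combining a 1-to-10 rating with a 1-to-6 rank in the same section invited mistakes.

from dataclasses import dataclass

@dataclass
class SectionResponse:
    ratings: list[int]   # six self-ratings, each expected in 1..10
    ranks: list[int]     # six importance ranks, expected to be a permutation of 1..6

def validate(resp: SectionResponse) -> list[str]:
    """Return a list of problems with a single section's responses."""
    problems = []
    if any(r < 1 or r > 10 for r in resp.ratings):
        problems.append("rating outside 1-10")
    if sorted(resp.ranks) != list(range(1, 7)):
        problems.append("ranks are not a permutation of 1-6")
    return problems

# Example: a participant who treated the rank column as another 1-10 rating.
confused = SectionResponse(ratings=[9, 8, 10, 7, 9, 8], ranks=[10, 9, 8, 7, 9, 6])
print(validate(confused))  # ['ranks are not a permutation of 1-6']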

There was a basic demographics section consisting of six questions with the same importance ranking as the personality and skills sections. The section turned out to be too basic: participants wanted to supply more demographic information about themselves. The ranking was applied correctly more often here, but it was still a source of trouble. Its significance was frequently misunderstood, and every participant ranked location as the most important factor.

The open-ended questions worked fairly well. They highlighted some of the problem areas and suggested features that needed extension. Unfortunately, they were also heavily dominated by comments along the lines of '...but what about my particular case that is unique and special...'.

The second revision of the survey (here) attempted to resolve some of the issues from the initial survey. Additional information was added in the form of more explicit instructions on the first page and a section detailing special-case restrictions on the third page.

For this survey the demographics section was expanded and the importance ranking was dropped. Questions in the personality and skills sections were reworded slightly to give a better sense that responses lie on a spectrum rather than being either right or wrong. More importantly, the 1-to-10 number scale was eliminated and replaced with the anchors 'least like me' and 'most like me'.

Unfortunately, the importance ranking continued to be a frequent source of confusion. In general, collecting two pieces of information (a ranking and a self-rating) in one section does not work well on a paper survey. When the survey was shifted to a digital format, those components were separated onto two consecutive pages. In the future I would separate the components for the paper test as well.

The open-ended questions were unchanged, but the other changes to the survey led to more helpful responses in that section. Compared to the first pilot, more responses suggested useful extensions to the survey rather than simply highlighting its obvious failures.

The final version of the survey that appears on the website differs again, based on the results of the second pilot test. The demographics section has been expanded further and now collects 18 items across 10 questions. Question text has again been reworded in places. Finally, all numbers have been dropped from the survey: questions are unnumbered and all responses are given on a scale presented without numbers.
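
As a small illustration, even though the scale is presented to participants without numbers, each position still needs an internal numeric code before analysis. The six-position width below is an assumption, not something stated above.

END_LABELS = ("least like me", "most like me")
SCALE_WIDTH = 6  # assumed number of positions between the two end labels

def encode(position_index: int) -> int:
    """Convert a selected position (0-based, left to right) to a 1-based code."""
    if not 0 <= position_index < SCALE_WIDTH:
        raise ValueError("selected position falls outside the scale")
    return position_index + 1

print(encode(0), encode(SCALE_WIDTH - 1))  # 1 6 -> the two ends of the scale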

Logic Model


Logic model for the placement program
Download a full-size PDF here.

Logic Model Notes:




The placement program is a new approach to matching teacher candidates and cooperating teachers for the requisite internship during the undergraduate education program.

Inputs for the placement program include money for personnel and for development of the website. Because the new system makes much heavier use of technology than the previous offering, many of these costs are new to the internship matching process this year.

Outputs consist primarily of the activities involved in developing and deploying the program website. After that, the new system is advertised to students, teachers, and school division administrators.

Outcomes for the program fall into three general streams. First, changes to the existing survey are required, including a shift in focus for the program. Second, the internship matching process and parameters need to reflect the new focus. Finally, increased teacher participation is an important goal of the program.

Evidence suggests that factors contributing to a successful internship were not being addressed with the previous survey. The goal is to shift from a survey collecting simple demographic data to a survey that addresses more comprehensive demographics, interests, personality traits, and skills. This represents not just a more in-depth approach, but also a change in focus for the internship process.

The previous matching process was time-consuming and difficult because it was completed entirely by hand. The placement program aims to improve overall results by automating the internship matching. This required a website to collect the information, and the goal is to leverage the automated system to expand internship opportunities. Possible areas for expansion include interning in pairs, placement in locations where the student has family supports, and tailoring placements to accommodate future employment directions.
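
The matching algorithm itself is not described in this post, so the following is only a hypothetical sketch of the general idea: score each student-teacher pair from their survey responses and assign the best-scoring pairs greedily. The item names, weights, internal 1-to-10 coding, and greedy strategy are all illustrative assumptions, not the actual method.

from itertools import product

def similarity(student, teacher, weights):
    """Weighted agreement across shared survey items (higher is better)."""
    score = 0.0
    for item, w in weights.items():
        # Assumes both groups answered the same items, coded internally 1-10.
        score += w * (1 - abs(student[item] - teacher[item]) / 9)
    return score

def greedy_match(students, teachers, weights):
    """Pair each student with at most one teacher, best-scoring pairs first."""
    pairs = sorted(
        ((similarity(s, t, weights), si, ti)
         for (si, s), (ti, t) in product(enumerate(students), enumerate(teachers))),
        reverse=True,
    )
    matched_students, matched_teachers, matches = set(), set(), []
    for score, si, ti in pairs:
        if si not in matched_students and ti not in matched_teachers:
            matches.append((si, ti, round(score, 2)))
            matched_students.add(si)
            matched_teachers.add(ti)
    return matches

# Hypothetical survey items and weights, purely for illustration.
weights = {"rural_ok": 2.0, "tech_focus": 1.0}
students = [{"rural_ok": 8, "tech_focus": 3}, {"rural_ok": 2, "tech_focus": 9}]
teachers = [{"rural_ok": 9, "tech_focus": 2}, {"rural_ok": 1, "tech_focus": 10}]
print(greedy_match(students, teachers, weights))  # tuples of (student index, teacher index, score)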

These two streams converge in a long-term goal to shift the focus of internship from a necessary step for certification to an opportunity for new teacher induction.

Traditionally, teacher participation in the internship program has been a struggle. It is not uncommon to have more teacher candidates than cooperating teachers; in those instances, university administrators have to petition teachers to take an intern. A goal of the placement program is to increase teacher participation through both new-teacher intake and repeat participation. The long-term goal is a supply of cooperating teachers that exceeds the demand from teacher candidates.

This program represents a new direction for the university. Furthermore, it represents a direction not currently taken by other institutions. A long-term goal of the program is to license the system to other institutions for tasks such as teacher internship placement, medical internship placement, and facilitating first-year teacher/mentor-teacher relationships in a school division.





Thursday, 3 April 2014

Evaluation Assessment

Engage Stakeholders:


Who should be involved?


Students, teachers, university administrators, and school division administrators.



How might they be engaged?


University administrators will be primarily engaged through collaboration during the evaluation. They will also be the primary recipients of evaluation reports. The primary method of engagement for students and teachers will be an online survey. School division administrators are an important piece of the puzzle for the placement program, but they will generally be involved in the evaluation only through summary reports of results.


Focus the Evaluation:



What are you going to evaluate?


The new intern placement program at the University of Saskatchewan will be evaluated. Intern placement matches teacher candidates entering the final year of a Bachelor of Education degree with cooperating teachers in the Saskatchewan education system for a four-month internship. The new program will automate the matching process through an online survey (hereafter referred to as the placement survey) and a quantitative matching algorithm. More information is available in the logic model.



What is the purpose of the evaluation and what questions will it answer?


The placement program represents a new approach to matching students with cooperating teachers. As such, many elements of the program are being attempted for the first time. The evaluation will assess the success of the new intern matching system. Three components of the program will be evaluated. First, the content of the survey needs to be examined. Second, the success of the web-based delivery method will be assessed. Finally, the success of the matching algorithm should be measured. The results of the evaluation will be used to improve the system for next year’s deployment.



Who will use the evaluation?


University administrators:
This group will be the primary focus of the evaluation. They will use evaluation results to improve the placement program for the next year.

School division administrators:
Evaluation results can be used to reassure school division administrators that this is a successful program. If directors, superintendents, and principals support the system, more teachers may be encouraged to join the program in subsequent years.

Teachers:
Successful internship placements are more likely to increase repeat-teacher participation. Additionally, positive word of mouth from participating teachers may increase overall teacher participation.

Students:
Students should be reassured that their voice is being heard during the placement program. Additionally, the evaluation results can be used for new students entering the program to reduce the uncertainty and stress inherent to an internship placement.


What information is needed to answer the evaluation questions?


Placement Survey content:
This will be evaluated by analyzing the placement survey responses through a combination of statistical methods including exploratory factor analysis. This evaluation question will require access to the results database.

Web-based delivery:
This question will require feedback from people who actually used the system during the program. This feedback will be obtained through a short online survey (hereafter referred to as the delivery survey).

Matching success:
An online survey will be used to assess the success of the internship placement (hereafter referred to as the matching survey). Additionally, focus groups and interviews with students and teachers may be included. The matching survey should be separate from the delivery survey, due to the large disparity in time between filling out the survey and completing the internship.


When is the evaluation needed?


The placement program runs in two stages during the year. Initial student and teacher data collection happens in March and April, and internship matches are evaluated in June. The internship runs from September to December.

This evaluation cannot be completed in full until early 2015 after students have finished the internship placement in December of 2014. The results will be used to improve the program for its 2015 launch in March.



What evaluation design will be used?


This will be primarily a process evaluation. The question assessment and the delivery survey fall firmly within the bounds of a process evaluation. The matching survey and the interview process lie in the grey area between a process evaluation and a summative evaluation. The actual alignment of those evaluation components will be determined by the focus of the questions in the survey and interviews.

Summative evaluation will be more useful in future years when progress towards long-term goals becomes a greater focus. This evaluation design is envisioned as being part of the yearly deployment cycle for the placement program. As such, a needs assessment would likely be useful in the future. However, it is outside the scope of the current evaluation plan.



Collect the Information:



What sources of information will be used?


Existing information:
The database of survey responses will be used for the statistical analysis.

People:
Students and teachers who participated in the placement program will comprise the primary source of information for this evaluation.


What data collection methods and instrumentation will be used?


Surveys will form the primary data collection method. There will be two surveys, both delivered online. The infrastructure for deploying the surveys is already available via the placement.usask.ca website. The surveys will need to be written, but they can reuse some of the open-ended questions from the pilot tests of the placement survey. The surveys will be promoted to everyone who took part in the placement program, although participation in the follow-up surveys will be completely voluntary.

Existing data will be used to evaluate the placement survey questions. I currently have access to the database of placement survey responses. Furthermore, a program has already been written to convert database results to a format readable by SPSS for statistical analysis. Data from all of the participants in the placement program will be used during this analysis.
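
The conversion program itself is not shown here; the sketch below only illustrates the general shape of such a step, under the assumption that responses live in a SQLite table named responses and that a CSV export (which SPSS can import) is an acceptable intermediate format. The database path, table, and columns are hypothetical.

import csv
import sqlite3

def export_responses(db_path: str, out_path: str) -> int:
    """Dump every row of the assumed 'responses' table into a CSV file."""
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute("SELECT * FROM responses")   # assumed table name
        columns = [desc[0] for desc in cur.description]
        rows = cur.fetchall()
        with open(out_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(columns)   # header row becomes the SPSS variable names
            writer.writerows(rows)
        return len(rows)
    finally:
        con.close()

# Usage (hypothetical file names): export_responses("placement.db", "placement_responses.csv")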

Focus groups and interviews may take place following the internship. Access to students for interviews and focus groups should be relatively easy because they are on campus; access to teachers would require additional planning. The purpose of the interviews is twofold: first, to get feedback on the success (or failure) of the internship matching process; second, to generate suggestions for future improvements. This method would be best employed after the survey results have been collated, to allow deeper inquiry into trends and factors that emerge from the survey results. Participants for interviews and focus groups will be chosen from teachers and students who completed the placement program.



What is the timeline for data collection?


The delivery survey should happen shortly after all the data has been collected. This will likely be in June or July of 2014, after internship matches have been calculated.

Statistical analysis of placement survey questions can happen any time after the initial data has been collected. This will happen in July or August of 2014, after internship matches have been calculated.

The matching survey should happen right at the end of the internship in December of 2014. This coincides with the end of the yearly cycle for the placement program.

If needed, interviews and focus groups would happen after the results of the matching survey have been collated, in January and February of 2015.




Analyze and Interpret:


How will the data be analyzed and interpreted?


The primary analysis will be statistical in nature. Survey results will be assessed to examine trends and deviations among respondents. Placement question assessment will use a variety of statistical techniques, primarily exploratory factor analysis.
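
As an illustration of the exploratory factor analysis step, here is a minimal sketch using scikit-learn rather than SPSS; the choice of library, the three-factor solution, and the placeholder response matrix are all assumptions made for the sake of a runnable example.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
responses = rng.integers(1, 7, size=(200, 12)).astype(float)  # placeholder data: 200 respondents, 12 items

fa = FactorAnalysis(n_components=3, random_state=0)
fa.fit(responses)

# Loadings: how strongly each survey item relates to each latent factor.
loadings = fa.components_.T            # shape: (items, factors)
for item, row in enumerate(loadings, start=1):
    print(f"item {item:2d}: " + "  ".join(f"{v:+.2f}" for v in row))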

Open-ended questions will be assessed using qualitative techniques to extract themes and factors. These themes can be used to direct the interviews and focus groups.
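
As a very rough illustration of a first pass (the real qualitative coding would be done by hand), one could tally keyword hits against candidate themes before reading the responses in depth. The themes and keywords below are purely hypothetical.

from collections import Counter

THEMES = {  # hypothetical themes and trigger words
    "location": ["rural", "city", "commute", "family"],
    "workload": ["time", "busy", "hours"],
    "clarity": ["confusing", "unclear", "instructions"],
}

def tally_themes(responses: list[str]) -> Counter:
    """Count how many responses touch on each candidate theme."""
    counts = Counter()
    for text in responses:
        lowered = text.lower()
        for theme, keywords in THEMES.items():
            if any(word in lowered for word in keywords):
                counts[theme] += 1
    return counts

print(tally_themes([
    "The instructions on page one were confusing.",
    "I need a placement close to family in a rural area.",
]))
# e.g. Counter({'clarity': 1, 'location': 1})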

The evaluator will be responsible for all stages of the evaluation analysis. Additional help may be requested to conduct and process interviews and focus groups.



What are the limitations?


The goal of the evaluation is to improve the placement program for its second year of operation.

As this is a new evaluation of a new program, there are many limitations. First among these is a lack of personnel. The evaluator (me) is also the primary software developer and a collaborating content developer for the main program. Keeping the evaluation components distinct from the main program components will be a challenge, as will splitting time between the two. The benefit of this tight integration is full access to program data and infrastructure for survey development and deployment.

Because the placement program is in its first year, the yearly deployment cycle is very fluid. Additional features are being added regularly and timelines often shift. Planned evaluation stages may be obsolete before they have a chance to be implemented.

Survey participation will be voluntary. The first survey will be deployed in the summer and the second survey will be deployed shortly before Christmas. This timing could result in small numbers of respondents.

This evaluation process is intended to become part of the yearly schedule for the placement program. Therefore, the timing for deployment of evaluation components and analysis of results will be constrained by regular program operations. Additionally, since the placement program is expected to change and evolve from year to year, the evaluation components will likewise need to evolve.




Use the Information:


How will the evaluation be communicated and shared?


The primary recipient of the evaluation results will be university administrators. The results will be used to aid decision making about changes in implementation for the following year. The results will be communicated through reports and meetings.

Teachers and school division administrators will be informed of results at teachers' conventions and conferences such as the National Congress on Rural Education. One of the goals of the placement program is to grow the available teacher pool to a size that exceeds the demand from students within the College of Education. Disseminating the successes of the program directly to educators is an essential part of this process.

Some of this data will be used during information sessions for new students who are preparing for the internship.

Summary data reports will be made available through the placement.usask.ca website. It is anticipated that these will be primarily used by students entering or leaving the program, and secondarily by teachers entering or leaving the program.




Manage the Evaluation:


Human subject protection.


Although evaluation surveys will be accessed through the placement.usask.ca website, they will be deployed in a separate section from the placement survey. No login will be required and no user data will be collected for these surveys.

Question analysis will be conducted on existing data, which contains identifying demographic information. This information will be ignored, and a random number will be assigned to each participant to preserve anonymity.
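
A minimal sketch of that anonymization step, assuming record and field names that are not specified above:

import secrets

def anonymize(records):
    """Replace the identifying field with a random participant number."""
    id_map = {}
    cleaned = []
    for rec in records:
        original_id = rec["student_id"]            # assumed identifying field
        if original_id not in id_map:
            # Draw until the random number is unused (collisions are unlikely).
            while True:
                code = secrets.randbelow(1_000_000)
                if code not in id_map.values():
                    id_map[original_id] = code
                    break
        cleaned.append({"participant": id_map[original_id],
                        "responses": rec["responses"]})  # keep only analysis data
    return cleaned

print(anonymize([{"student_id": "abc123", "responses": [4, 2, 5]}]))
# [{'participant': <random number>, 'responses': [4, 2, 5]}]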



Timeline.


The survey on the web-based delivery and the question analysis will happen during the early summer after the first stage of the placement program has been completed. This will likely be June and July of 2014.

The final survey will be deployed in December of 2014 at the end of the internship.

Interviews and focus groups would be conducted as needed in January and February of 2015.



Responsibilities.


The evaluator will be responsible for the deployment of instruments and the collection of all data with the following exceptions:

Survey questions will be written in consultation with university administrators to ensure that appropriate target areas are addressed.

Additional help may be requested for interviews and focus groups. This will depend on the number of interviews and focus groups that are required.



Budget.


The primary cost associated with this evaluation is time: time to create and deploy the surveys, time to collect and summarize the results, time to conduct the statistical analyses, and time to disseminate the results.

The only major equipment cost associated with this evaluation is a computer for development of content and analysis of results. Proprietary software requirements include an office suite (Microsoft Office) for report composition and a statistical package (SPSS) for analysis.

Additional costs for interviews may include a support person to assist with running interviews and collating results, and a recording device to use during interviews.

A general pool for miscellaneous expenses will also be included.




Standards:



Utility.


Every effort will be made to ensure that the data from the evaluation is effectively used. This will be aided by involving university administrators in the planning process to better aim surveys at evaluation questions.



Feasibility.


As the infrastructure for most of the evaluation components already exists, the proposed evaluation should be feasible. The primary investment will be the evaluator’s time. Since the evaluation is intended to be an integral part of the program's yearly deployment cycle, that time should be an acceptable cost.



Propriety.


This program represents a new direction for the College of Education. As such, it is imperative to know whether the program is working. New programs can change substantially in their first few years as unforeseen deficiencies are addressed. This evaluation represents a genuine attempt to make the placement program better.



Accuracy.


The evaluation is designed to address specific areas of the program. A targeted evaluation component will be developed to address each of the three evaluation questions. Statistical methods pertinent to the evaluation components will be applied along with general summary statistics.

Saturday, 8 February 2014

Program Evaluation Model Selection

For this program I suggest a primarily goals-based evaluation focus (as defined by McNamara, 2002) within the framework of a case study evaluation model (as defined by Stufflebeam, 2001). Goal specification is conspicuously absent from the information provided in the program document, which talks at some length about the incidence of, and risk factors for, gestational diabetes mellitus (GDM).

However, there are no statements regarding targeted outcomes of the program. Is the goal decreased incidence of GDM, increased education about risk factors, better overall health, or the establishment of a social support network?

Somewhat tangential to goals-based evaluation is the question of participation level. A very high percentage of inquiring women participated (69%), but a very low percentage of the estimated population inquired (11%). Were there resources in place to accommodate 69% participation at the population level? That could potentially be 300 women (plus buddies) looking for child care, bathing suits, food, exercise space, and so on. The participation question starts to move the evaluation towards a process-based focus, but I believe it ties into the goal of serving the target community.

The main goal of a case study model (Stufflebeam, 2001) is to “delineate and illuminate” (p. 34) a program rather than to direct it or assess its worth. Its strengths include the incorporation of various stakeholders and multiple methodologies. Perhaps most importantly for this application, a case study is a natural fit for a focused program evaluation. Its primary limitation is a direct offshoot of that strength: it is not well suited to a whole-program evaluation. As that is not a requirement in this situation, the strengths of the model greatly outweigh its limitations.

References 

McNamara, C. (2002). A Basic Guide to Program Evaluation. Retrieved from the Grantsmanship Center website: http://www.tgci.com/sites/default/files/pdf/A%20Basic%20Guide%20to%20Program%20Evaluation_0.pdf

Stufflebeam, D. L. (2001). Evaluation Models [Monograph]. New Directions for Evaluation, 89, 7-98. doi:10.1002/ev.3

Saturday, 1 February 2014

Program Evaluation Assessment

The Upward Bound Math and Science initiative (UBMS) was established in 1990 within the scope of the existing Upward Bound (UB) program. It was created, within the United States Department of Education, with the goal of improving the performance and enrollment of economically disadvantaged K-12 students in math and science. Additionally, it sought to address the underrepresentation of disadvantaged groups in math and science careers. These goals were pursued by funding UBMS projects at colleges and universities that provided hands-on experience in laboratories, at field sites, and with computers, and by running summer programs targeted towards university preparation in math and science. UBMS grew from 30 programs in 1990 to 126 programs in 2006, serving 6,707 students at an annual cost of $4,990 per student.

There have been evaluations of UBMS in 1998, 2007, and 2010. My assessment will focus on the 2010 evaluation, which combined data from 2001-2002 and 2003-2004 and focused entirely on postsecondary outcomes. Each subsequent evaluation builds on previous work, which represents the first difficulty I had with this report. Old data were often presented for comparison with new data and, in some situations where new data were not available, old data were incorporated into the analysis. Without very careful reading it was often difficult to distinguish what was being reported.

This was a purely summative, outcomes-based evaluation presented exclusively from a quantitative, statistical perspective. The summative, quantitative approach is effective in justifying the cost per student, as it provides hard numbers to compare against hard numbers. The method does not match up particularly well with any specific program evaluation model, although it comes closest to goal attainment as defined by Tyler, since the purpose of the report is to connect the objectives of UBMS with measurable outcomes.

There were two components to the evaluation. The first was a descriptive analysis of survey information about staff credentials and demographics, student recruitment and enrollment, program offerings, and other demographic aspects; this analysis was not the focus of the study and was used solely for descriptive purposes. The bulk of the study consisted of an impact analysis in which outcomes of UBMS participants were measured against participants in the regular Upward Bound program rather than the general populace. This enabled measurement of the additional success provided by UBMS over and above any success achieved by UB. However, since the successes of UB are not enumerated in this evaluation, the numbers presented are somewhat misleading: they are primarily relevant in comparison to data that are missing.
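
For readers unfamiliar with this kind of impact analysis, a comparison of an outcome rate between the UBMS group and the UB comparison group might look like the sketch below; the counts are invented for illustration and are not the report's actual figures or methods.

import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical counts: enrollments at four-year institutions in each group.
enrolled = np.array([620, 500])       # UBMS group, regular UB comparison group
group_sizes = np.array([1000, 1000])  # invented sample sizes

stat, p_value = proportions_ztest(enrolled, group_sizes)
print(f"z = {stat:.2f}, p = {p_value:.4f}")  # a small p suggests a real difference between groups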

Even with this caveat, the results are impressive. The evaluation found statistically significant increases in enrollment at four-year institutions (12%), enrollment at selective institutions (18.6%), enrollment in math and science courses (36.5% more credits completed), and social science degree completion (11%). There was also an increase in completion of math and physical science degrees, though it was not statistically significant. Overall, despite some minor data and analysis issues, the UBMS evaluation paints a clear picture of the program's successes.