Statistical Significance

Beka · Post by **Beka** » Wed Mar 05, 2008 3:56 pm

Hello. I am doing a high school project for the Washington State Science & Engineering fair. I would like to be competitive for ISEF and therefore need to show statistical significance for my survey results.

I have been researching how to get the p-values needed and think I have found what I need. However, I think my hypothesis included too many manipulated variables.

The hypothesis is: Music that is loud or has a strong rhythmic beat playing in a store will increase customers’ frustration levels and distract them from their shopping plans when they are in a hurry or have 10+ items to purchase.

I was hoping to lump ‘loud music’ with ‘strong rhythmic beat’ since I think they have the same effect.

Also, I was hoping to lump ‘hurried’ with ‘10+ items’ since I think they have the same effect.

I wanted

(Loud and/or Strong beat) vs. ( quiet and/or soft beat) when (hurried or 10+ items)

The p-values for these results were great!

However, is that legal? Can I make group one contain two different shopping experiences? (Loud music and/or Strong beat) and group two contain two different shopping experiences ( quiet and/or soft beat)? Or are there too many variables?

Is this really four tests?

Loud music volume vs. low music volume when hurried (type of music may affect this score)

Strong rhythmic beat vs. soft beat when hurried (we believe the volume of the soft beat may affect this score)

Loud music volume vs. low music volume when buying 10+ items

Strong rhythmic beat vs. soft beat when buying 10+ items

The data from over 100 surveys showed that the strong rhythmic beat or loud music did have a strong effect on the frustration levels even when the shoppers were not in a hurry or preplanning to buy 10+ items. So that part of my hypothesis was narrower than necessary.

Then there is the stinky part “and distract them from their shopping plans”. It can be argued that if someone is frustrated they are by definition distracted. However, only 50% said they felt they were distracted in a yes/no question. If the hypothesis said “Or distracted….” or if it said “and increase their distraction levels…” I could work with it. The p-values showed significance for the increase in the number of shoppers who said they were distracted while shopping with rock music. But the hypothesis appears to say 100% of those frustrated by rock music would be distracted (What was I thinking?). Do I say my hypothesis is partially correct, and then explain in the analysis that rock music did increase distraction by a statistically significant amount even though it did not cause all of the shoppers to be distracted from their shopping plans?

Thanks,
Beka

deleted-2131 · Post by **deleted-2131** » Fri Mar 07, 2008 1:31 pm

Beka,

You raise some very important issues in your post. Thank you for being so thorough! One of the first questions that comes to mind regards the type of statistical analysis you are doing. Since you say you have p-values, I'm assuming that you are using a hypothesis test (e.g. a t-test, a Chi-squared test, etc.). From what you've said, I'm guessing that you used a t-test.

Using a t-test assumes that you have quantitative variables (i.e. variables that can be measured with numerical values). From what you've said, it appears that you had the people who took your survey rate their frustration level on a scale (e.g. "on a scale of 1 to 10, to what extent did you feel ______ when shopping?"). Of course, there are issues associated with surveys themselves, and the wording of your questions can significantly bias your results.

One question I have is how you quantified “loud music with strong, rhythmic beat.” Since you say you did a survey, I’m assuming that the subjects all rated the music they heard during a shopping experience as “loud” or “having a strong rhythmic beat.” Did the subjects answer these as two separate questions? Or were these terms lumped together in one question? If they were put together in one question, it is probably valid to “lump these together.” However, if the subjects responded to these criteria as two separate questions, it makes no sense whatsoever to lump them together. Just because you think they have the same effect doesn’t mean that they do and if the subjects treated “loud” and “strong, rhythmic beat” as two separate things, then you should, too. The same goes for “hurried” and “10+ shopping items.”

As far as your analysis goes, you must identify your null and alternative hypotheses for each test you do. And it makes absolutely no sense to calculate any sort of p-value without performing exploratory data analysis.

Another thing you need to be careful about is your wording. No matter how good your survey is, you cannot say that the music had an effect on the shoppers. You have no evidence that the music actually caused the observed effect on the shoppers. All you have is an observed relationship, which is called a correlation. You can say that shoppers who responded by saying that they were bothered by the loud music tended to be more frustrated, but you cannot then infer that the music actually caused them to be frustrated.

A lot of what you do here really depends on your data. If you could post a copy of the survey you used, we would probably be able to help you more. It would also really help if you could elaborate on what sort of exploratory data analysis you did (graphs, calculating averages, standard deviations, etc.) and what hypothesis testing methods you used.

Looking forward to your response!

Beka · Post by **Beka** » Fri Mar 07, 2008 6:13 pm

Terik Daly,

Thank you so much for your help! I only met with my mentor once and realize now some of the things she said about narrowing down my study.
Below are the questions of my survey that I am focusing on.
1. While shopping were you: Leisurely 1 2 3 4 5 6 7 8 9 10 Hurried
2. How many specific items did you pre-plan to purchase? 0 1-9 10-19 20+
3. What type of music was playing in the store?
Soft Rock Rock Classical No music Other ______________
4. Was the Music: Very quiet 1 2 3 4 5 6 7 8 9 10 Very loud
5. Was your shopping trip enjoyable?
Enjoyable 1 2 3 4 5 6 7 8 9 10 Frustrating
6. Did you enjoy the music that was playing?
Enjoyable 1 2 3 4 5 6 7 8 9 10 Irritating
7. Do you feel the music distracted you from fulfilling your shopping plan? Yes No

I am using a website that we found to calculate the statistics and am not sure if I am using it correctly.
http://faculty.vassar.edu/lowry/odds2x2.html

I am a home school student, so I do not have easy access to a statistics teacher. My mom has a BS in math from 20 years ago, but never applied statistics to surveys. A previous high school math teacher we know said I should convert the 1 to 10 answers into absent/present.
For example #5. Was your shopping trip enjoyable? 1 to 5 = absence of frustration, 6 to 10 = presence of frustration.
However, he couldn’t remember how to apply the formulas for chi-square.
If you have any resources you can suggest that explains how to do chi-squares, I would appreciate it.
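
The absent/present conversion described above can be sketched in a few lines of Python. The cutoff (1 to 5 = absent, 6 to 10 = present) comes from the post; the function names and the five example responses are made up for illustration and are not survey data.

```python
from collections import Counter

def is_present(rating, cutoff=5):
    """Return True when a 1-10 rating is above the cutoff (6-10 = present)."""
    if not 1 <= rating <= 10:
        raise ValueError("rating must be between 1 and 10")
    return rating > cutoff

def two_by_two(rows):
    """Tally (group, rating) pairs into counts keyed by (group, present?)."""
    return Counter((group, is_present(rating)) for group, rating in rows)

# Hypothetical responses: (music type, frustration rating). Not real data.
example = [("rock", 8), ("rock", 3), ("not rock", 2), ("not rock", 7), ("not rock", 5)]
table = two_by_two(example)
print("rock:    ", table[("rock", False)], "absent,", table[("rock", True)], "present")
print("not rock:", table[("not rock", False)], "absent,", table[("not rock", True)], "present")
```

The four counts that come out of a tally like this are the four cells that a 2x2 chi-square or Fisher calculator asks for.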

I’m not sure what you mean by identifying the null and alternative hypotheses for each test and performing exploratory data analysis. Please let me know if I did those things in the explanation below.

Again my hypothesis was: Music that is loud or has a strong rhythmic beat playing in a store will increase customer’s frustration levels and distract them from their shopping plans when they are in a hurry or have 10+ items to purchase.

I have 138 surveys. By saying in my survey that the frustration and distraction would be increased for those who were in a hurry or had 10+ items to purchase, I narrowed down the number of usable surveys to about 50. However, if I look at all of the surveys, regardless of how hurried they are or how many items they want to buy, the results are even stronger.

Below are some of my results:

For those Hurried (6 to 10)

Rock (strong rhythmic beat) 3 absence of frustration, 9 presence of frustration
Not rock 26 absence of frustration, 14 presence of frustration
Yates Chi-square P < .04

Rock 6 absence of distraction, 6 presence of distraction
Not rock 35 absence of distraction, 2 presence of distraction
Fisher exact probability test p< .01

High volume 4 absence of frustration, 12 presence of frustration
Low volume 28 absence of frustration, 14 presence of frustration
Yates Chi-square p<.02

High volume 10 absence of distraction, 6 presence of distraction
Low volume 37 absence of distraction, 2 presence of distraction
Fisher exact probability test p < .01
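
For anyone who wants to check the two Yates chi-square results above without the website, here is a minimal sketch using the standard continuity-corrected formula for a 2x2 table. The counts are the ones posted above; the function names are my own, and the p-value uses the fact that a chi-square with 1 degree of freedom has upper-tail probability erfc(sqrt(x/2)). It is a check on the arithmetic, not a claim about how the online calculator is implemented.

```python
import math

def yates_chi_square(a, b, c, d):
    """Yates-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Continuity correction: shrink |ad - bc| by n/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Counts from the post: (absence, presence) of frustration among hurried shoppers.
rock = yates_chi_square(3, 9, 26, 14)      # rock vs. not rock
volume = yates_chi_square(4, 12, 28, 14)   # high volume vs. low volume
print("rock:  ", round(rock, 2), round(p_value_1df(rock), 3))
print("volume:", round(volume, 2), round(p_value_1df(volume), 3))
```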

Is it correct to say:
There was a strong correlation (p<.02) that subjects in a hurry, who said the music in a store was loud, tended to be more frustrated than those who said the music was quiet.
There was also a strong correlation (p<.04) that subjects in a hurry, who described the music as ‘rock’, tended to be more frustrated than subjects who described the music as ‘soft rock’, ‘classical’ or ‘no music’.

Can I use the fisher exact probability test p values?

The results for 10+ items were similar, but the p values were not as good. The Chi-square is calculated only if all expected cell frequencies are equal to or greater than 5 (whatever that means!). I also don’t know what they mean by “probability estimates are non-directional”.
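
On the "expected cell frequencies" rule: the expected count for a cell is what you would see if the row variable and the column variable were unrelated, which works out to (row total × column total) / grand total. The sketch below applies that to the hurried rock/distraction counts posted above. The function name is made up, and reading the calculator's rule as "every expected count must be at least 5" is an assumption based on the wording quoted in the post.

```python
def expected_counts(table):
    """Expected cell counts for a two-way table if rows and columns were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Counts from the post: hurried shoppers, (absence, presence) of distraction.
observed = [[6, 6],    # rock
            [35, 2]]   # not rock
expected = expected_counts(observed)
smallest = min(min(row) for row in expected)
print([[round(x, 2) for x in row] for row in expected])
print("all expected counts at least 5:", smallest >= 5)
```

For this table one expected count falls below 5, which would explain why a Fisher exact result was reported for it instead of a chi-square.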

When I used all 138 of my surveys (whether or not they were in a hurry or had 10+ items to purchase) the results were:

Rock 9 absence of frustration, 16 presence of frustration
Not rock 74 absence of frustration, 20 presence of frustration
Yates Chi-square P < .001

Rock 12 absence of distraction, 13 presence of distraction
Not rock 81 absence of distraction, 6 presence of distraction
Fisher exact probability test p< .000002

High volume 10 absence of frustration, 20 presence of frustration
Low volume 83 absence of frustration, 19 presence of frustration
Yates Chi-square p<.0001

High volume 17 absence of distraction, 12 presence of distraction
Low volume 89 absence of distraction, 6 presence of distraction
Fisher exact probability test p < .00003
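
The Fisher exact probabilities above can be checked by hand as well. The sketch below assumes the usual two-sided definition (add up the probability of every table with the same row and column totals that is no more likely than the observed one); the online calculator may report one-tailed and two-tailed values separately, so treat this as an illustration rather than a reproduction of its output. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def weight(k):
        # Number of ways to get k in the top-left cell with all margins fixed.
        return comb(col1, k) * comb(n - col1, row1 - k)

    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    observed = weight(a)
    # Exact integer comparison: keep every table no more likely than the observed one.
    extreme = sum(weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed)
    return extreme / comb(n, row1)

# Counts from the post: (absence, presence) of distraction, rock vs. not rock.
p_hurried = fisher_exact_two_sided(6, 6, 35, 2)     # hurried shoppers only
p_all = fisher_exact_two_sided(12, 13, 81, 6)       # all surveys
print(p_hurried, p_all)
```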

Is it correct to say:
Furthermore, there was a stronger correlation (p<.001) that all subjects who said the music in a store was loud, tended to be more frustrated.
There was also a stronger correlation (p<.001) that all subjects who described the music as ‘rock’, tended to be more frustrated than subjects who described the music as ‘soft rock’, ‘classical’ or ‘no music’.

Thanks,
Beka

deleted-2131 · Post by **deleted-2131** » Fri Mar 07, 2008 7:19 pm

Beka,

I will look over what you have posted this evening and post back with comments/suggestions, etc. Thanks for being so thorough and providing so much information. It will make helping you much easier!

deleted-2131 · Post by **deleted-2131** » Fri Mar 07, 2008 8:23 pm

Beka,
I’m going to start with some basic statistics information to give you some background before I make direct comments on your post.
You are doing what is called an observational study. You are observing people and measuring variables that you are interested in, but you are not (hopefully) influencing their responses.
One of the first questions to ask yourself is how you got your data. Did people volunteer to do your survey? Did you do your survey on shoppers from the same store? Did you only survey a single gender? Did you survey a single age group? Did you just survey those people that were easiest to reach? Or did you select the people who took your survey randomly? Did you have people who you asked to take your survey who would not do it? Did you ask the people who took your survey to recall a shopping experience that was a while ago, or did you have them do the survey right after they finished shopping? Did the people who took your survey do it at the same time of day? Did any of your questions “plant” an idea in the respondent’s mind that may not have been there before?
Thinking about questions like these will help you get a sense of how reliable your data is. One of the important things in statistics is defining your sample population. The sample population is all the people who you are interested in. It might be teenage girls living in Ohio, it might be 35-40 year old men with salaried jobs, etc. You need to define what your sample population is. To some extent, this will have to be based on the people that you surveyed and how you decided that they should do your survey.
Once you have gathered your data, you want to first do what is called exploratory data analysis. Exploratory data analysis involves looking at your data for trends and patterns. In your case, you might want to make a table with the counts of how many people gave each answer to every question in your survey (e.g. for question 1, 20 people put “1”, 17 people “2”, 8 people “3”, etc.) Go through and do this for each question. You may then want to express these numbers as percentages (e.g. 7% of people said “1”, 9% of people said “2”) and put this information in a table.
Now, look at each variable separately. (Look at the responses to each question before trying to compare two questions.) A bar graph will be helpful in looking at the responses for each question. Put the various categories (the options that the respondent had to choose from) on the x-axis and put the count or percent on the y-axis. Looking at the responses to each question helps you know what sorts of comparisons you want to make.
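
As a rough illustration of the count table, the percentages, and the bar graph described above, here is a small Python sketch. The ten answers are invented purely to show the mechanics; the function name is hypothetical, and the real survey responses would go in place of the example list.

```python
from collections import Counter

def tally(responses, options):
    """Return (option, count, percent) rows for one survey question."""
    counts = Counter(responses)
    total = len(responses)
    return [(opt, counts[opt], 100 * counts[opt] / total) for opt in options]

# Hypothetical answers to a 1-10 question; replace with real survey responses.
answers = [1, 1, 2, 3, 3, 3, 5, 5, 6, 8]
rows = tally(answers, range(1, 11))
for option, count, percent in rows:
    # One '#' per response gives a rough text bar graph.
    print(f"{option:>2} | {'#' * count:<5} {count:>2} ({percent:.0f}%)")
```

Running the same tally once per question gives the one-variable-at-a-time view before any two questions are compared.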
Once you have looked at the data from each question individually, you can start to compare the responses to the questions. You want to look at the responses from ALL the surveys, not just 50.
Converting the answers to present/absent is one way of going about looking at your data, but since you only asked one yes-no question (present-absent) I’m not sure this would be the best way to go about your analysis.
Looking at your questions in terms of your hypothesis, one thing I notice is that you use the phrases “loud” and “strong rhythmic beat” in your hypothesis, but your survey questions have the respondents rate “loudness” on a scale from 1 to 10 and don’t ask about whether the music has a strong, rhythmic beat. So, in order to look at your hypothesis, you are going to have to define “loud” in terms of the options your respondents had. In this case, you will have to decide what numeric values are “loud” and what numeric values are “not loud.” This is similar to the absent/present reasoning.
The more trying issue is the phrase “strong, rhythmic beat.” Nothing in your survey asks whether the music has a strong rhythmic beat. How were you planning to define this? You ask about what type of music was playing and give several options (Soft Rock Rock Classical No music Other _____ ) but I know of classical music that has a strong, rhythmic beat and I know of rock music that has a strong rhythmic beat. Do you see the issue here? Another issue is that these categories are not well defined. You might consider something rock that I would think is soft rock, and I might call something classical that you might call rock, because our definitions of these categories probably differ.
Once you have figured out how you are going to define these two categories (these definitions will be purely contrived for the purposes of your project, so you will need to be careful about comparing your results to others) you can then start to look at the patterns in response to these two variables using tables and graphs.
This same approach will need to be done for “frustration level” and “distracted” and “in a hurry.” Perhaps the absent/present definition is the way to go, but that decision should only be made after examining the responses to the individual questions as given on the survey. Luckily the 10+ items is clearly defined in your survey, so this definition has already been made.
Once you have looked at the response distributions for each of your survey questions and have defined your variables of interest in terms of the responses to your survey, you can then use tables and graphs to look at counts and percents of your variables of interest, each individually, and then you can start comparing them.
Once you have done this exploratory analysis of your data (notice that we haven’t calculated any p-values yet), you can then start to decide how you want to go about performing inference.
I know this is a lot to take in, so I won’t go any further until you have had a chance to read over what I’ve said, answer the questions I have asked, and do exploratory data analysis. Post back with your findings and with any questions that you have. If anything is confusing or doesn’t make sense, please let me know so that I can help you.

Beka · Post by **Beka** » Mon Mar 10, 2008 12:18 pm

Terik,

Thank you for the explanation. Statistical analysis is starting to make more sense now.

Those surveyed were friends of our family and their friends. I sent surveys by e-mail and handed them out. I asked about 150 people and 66 responded. All surveys were either handed back to me or mailed. No data was received by e-mail or internet. About 80% of the subjects are stay-at-home moms in Washington State. About 7 subjects were out of state. Subjects were asked to fill out the survey following a shopping trip. Some subjects filled out more than one survey to describe different shopping trips.

Type of music: Soft rock 42%, Rock 19%, Classical 2%, No music 24%, other 13%
Ages: 20-29 years old 3%, 30-39 18%, 40-49 46%, 50-59 25%, 60+ 8%
Gender: female 92%, male 8%
Number of People in subject’s household : 1 person – 1%, 2 people – 13%, 3 – 11%, 4 – 35%, 5 – 21%, 6 – 6%, 7 – 10%, 9 – 1%, 11 – 2%
Number of children with subject while shopping: 0 children – 60%, 1 child – 20%, 2 – 13%, 3 – 4%, 4 – 2%, 5 – 1%
There were 53 different stores visited by subjects. Fred Meyer- 26 surveys, Safeway -20, Albertsons – 9, Costco – 7, Bartell -5, Goodwill – 4, Winco – 3, Home Depot – 3, & 45 stores with 1 or 2 surveys.

These are the definitions I found for rock and soft rock.
Rock – “In its purest form, Rock & Roll has three chords, a strong, insistent back beat, and a catchy melody.” http://www.allmusic.com/cg/amg.dll?p=amg&sql=77:32
Rock - “A musical style derived in part from blues and folk music marked by an accented beat and repetitive phrase structure”. Webster's college dictionary.
Soft rock – “Rock & roll that is relatively melodic in style with an under emphasized beat.” Webster's college dictionary

Each subject chose from (Soft Rock Rock Classical No music Other___)
If a person heard rock they had the choice to call it ‘soft rock’ or ‘rock’. So, by the definitions above the beat should have been stronger in their opinion if they chose ‘rock’.

Can I say?
Since each subject had the choice between ‘rock’ and ‘soft rock’ for my survey, I believe they would choose ‘rock’ for music that they felt had a stronger beat than ‘soft rock’, therefore for my study I define:
‘Rock’ as a ‘strong rhythmic beat’
‘Soft rock’ and ‘no music’ as the absence of a ‘strong rhythmic beat’

I personally think classical does not have a strong rhythmic beat, however, if you think there may be some objection, then I can count it with the ‘other’ and not use that data for this part of the study. There were only 3 that chose ‘classical music’.

I defined quiet as volume level (1 to 5) and loud as (6 to 10).
I realize my hypothesis was not worded correctly to match the survey. My original question further complicates matters. “Does the type of music played at a store affect customer’s satisfaction and shopping performance?” I obviously needed more help in matching my question, hypothesis and survey. I used the words enjoyable & satisfaction as synonyms.
I do believe I have good results to show, but is it going to be a big problem that my wording is off? Or do I say that I partially proved my hypothesis and furthermore found these other correlations? Or am I asking these questions too soon?

Over the next few days I will do the exploratory data analysis for all of the surveys together. I guess I was doing this backwards, since I split them into the categories first and then analyzed the data.

Thanks,
Beka

deleted-2131 · Post by **deleted-2131** » Mon Mar 10, 2008 3:37 pm

Beka,

This is really wonderful. You're starting to appreciate some of the complexities of statistical analysis and are beginning to really look at your data. This is excellent! This kind of scrutiny and detail is what will get you to ISEF.

NOTE: THIS IS VERY IMPORTANT! If you go to ISEF, you must be in compliance with ALL ISEF rules. If you are not compliant you will be disqualified from the fair. They are very, very, very strict about these things. One of the rules about doing research on people is that 1) if you are giving a survey to people, the survey must be IRB approved and 2) you must have a signed informed consent form for every person who participates in your study. If your regional fair (which I'm assuming is Intel ISEF-affiliated) did not discuss these rules with you, please contact the SRC director for your fair to make sure that you are in compliance with all ISEF rules. If you have already done these things, that's wonderful; if you haven't, please, please let me know and please contact your regional fair SRC so that we can help you be ISEF compliant.

Since you are working with surveys of people, it is very important that when you talk about how you collected your data, you say how you chose who took your survey. It sounds like YOU chose who took your survey by asking people you knew and the friends of people you knew. This is important. One of the things that is really difficult about doing research using surveys is that you are never quite sure if the data you're getting is representative of the whole population. This concept is a central idea in statistics. When we do an experiment or a survey, what we are really trying to do is to collect sample data and then use that sample data to draw conclusions about the entire sample population.

Here's an example: Pretend you are an apple grower selling thousands and thousands of apples to different grocery stores. The grocery stores don't want to buy rotten or wormy apples, so you want to make sure that you are only selling good apples. Since you are selling so many apples, it wouldn't be practical to check every single apple to see if it is good. So instead, you only check a few apples and from those few apples you infer the status of all the apples. For instance, you may sample 5 different apples from each bushel and if those 5 apples are good, then you say that the whole bushel is good. But if those 5 apples are all bad, you say the whole bushel is bad. Do you understand the point I’m trying to make? We can’t know what is really going on in the population (the bushel of apples), and so we use a sample (the 5 apples) to draw conclusions about the population.
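
To put a number on the apple example, here is a small sketch of how often a sample of 5 would look perfect even when the bushel is not. The bushel size (100) and the number of bad apples (10) are invented for illustration, and the function name is hypothetical; the calculation is the standard count of ways to draw 5 apples that are all good, divided by the count of ways to draw any 5.

```python
from math import comb

def prob_all_sampled_good(bushel_size, bad_apples, sample_size=5):
    """Chance that a random sample drawn without replacement contains no bad apple."""
    good = bushel_size - bad_apples
    return comb(good, sample_size) / comb(bushel_size, sample_size)

# Hypothetical bushel: 100 apples, 10 of them bad.
p_clean = prob_all_sampled_good(100, 10)
print(round(p_clean, 3))
```

With these made-up numbers the 5 sampled apples would all be good a bit more than half the time, so a clean sample is evidence about the bushel, not proof.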

What you are doing is trying to figure out how all people respond to music while shopping by looking at how a sample (the people you survey) responds to music while shopping.

But now comes the tricky part. How do we know that our sample is representative of the entire population? For example, if we are trying to figure out what percentage of the population has pink as their favorite color, but we only survey 5 year old girls, our sample will have a much higher percentage of people whose favorite color is pink than will the population, which includes everybody of all ages and genders and races and everything else you can think of. Similarly, if we wanted to know how many rabbits live in the state of New York, but only count the rabbits that live in Times Square, our sample won’t be representative of the whole population of rabbits in New York. Are you starting to understand this point?

When we have a sample, we have no way of knowing whether it actually is representative of the population. We hope it is, but we just can’t be sure because we hardly ever know the whole population. So, one of the things that we want to be very careful with when we are doing a survey is to try to pick a sample that is representative of the population. But in order to do that, you must first define what your population is.

For example, in your survey, you surveyed your family friends and your friends’ friends. But what if your friends aren’t representative of the population? For instance, your survey doesn’t include any teenagers. Do you think it is possible that teenagers might respond to music while shopping differently than people who are 40-49 years old? It’s definitely possible, but we don’t know for certain because we don’t have any data on them. If a sample is consistently not representative of the population, the sample is said to be biased. You need to decide what the population is you are trying to study (e.g. you are only interested in females ages 20 to 60) and clearly define that. You then need to decide if your sample is biased in any way. Let me know how you want to define the sample population and let me know if you think that your sample has any bias in it, and then I will share my thoughts on this with you.

Here are some more things to think about:
1. Did your survey include the definitions of rock and soft rock that you’ve given here? Did you give any definition for any of the music categories on your survey, or did you let your subjects decide for themselves what kind of music they were hearing?
2. My point with the whole classical music thing is not that your definition is wrong; my point is that I might define a category of music differently than you would. And your mother might define it differently again. I’m trying to get you thinking about possible problems with your survey. What I consider to be rock, you might consider to be soft rock, and so forth.
I’ve got to run, but hopefully this gives you some things to think about.
Keep up the good work! Let me know if anything is confusing or if you have more questions or anything.
You really are doing a wonderful job!!!

Beka · Post by **Beka** » Wed Mar 19, 2008 2:06 pm

Terik,

Yes, I did get IRB and SRC approval before I gave out my survey. It took about a month. My IRB said that informed consent was recommended but not required. I have about 98% of my consent forms. Do I need to get the last few?

I personally sent out my survey to almost everyone whose e-mail address I could get a hold of. Most of those on my original list are home schoolers. However, I personally know only about a fourth of the people who filled out my surveys.

Should I remove the surveys that were filled out by men (8%), & those outside of Washington State & those under 30 years old (3%) to narrow down my population? Should I say?
My population is females in Washington over 30 years old with a home school bias.

No, I did not include definitions in my survey. I let the subjects decide for themselves if they would classify what they were hearing as ‘rock’ or ‘soft rock’. Is it okay to assume that since they had a choice between ‘rock’ and ‘soft rock’ then they considered the beat to be stronger if they chose ‘rock’?

Here is the analysis information you recommended I look at. The data is in bold. I am on spring break vacation with my family and do not have the information for #4, so I made a guess. I just put percentages for most of them. There are 135 surveys. Let me know if you need the numbers at this point as well.

Shopping & Music Survey

1. Age: 20-29 30-39 40-49 50-59 60+
20-29 years old (3%), 30-39 years old (18%), 40-49 years old (46%), 50-59 years old (25%), 60+ (8%)

2. Gender: Male Female
female (92%), male (8%)

3. Name of store: ________________________________
There were 53 different stores visited by subjects. Fred Meyer- 26 surveys (19%), Safeway -20 (15%), Albertsons – 9 (7%), Costco – 7 (5%), Bartell -5 (4%), Goodwill – 4 (3%), Wal-Mart – 4 (3%), Winco – 3 (2%), & 45 other stores with 1 or 2 surveys.

4. City, State: __________________________________________
I think about 94% are from Washington state.

5. Number of different items purchased: 1-9 10-19 20+
1-9 (59%), 10-19 (25%), 20+ (16%)

6. Number of people in your household:_____________________
1 person – (1%), 2 people – (13%), 3 people – (11%), 4 people – (35%), 5 people – (21%), 6 people – (6%), 7 people – (10%), 9 people – (1%), 11 people – (2%)

7. Number of children with you while shopping:_______________
0 children – (60%), 1 child – (20%), 2 children – (13%), 3 children– (4%), 4 children – (2%), 5 children – (1%)

8. Time spent shopping in the store: 1-15 minutes 15-30 minutes 30+ minutes
1-15 (29%), 15-30 (41%), 30+ (30%)

9. While shopping were you: Leisurely 1 2 3 4 5 6 7 8 9 10 Hurried
1 (13%), 2 (5%), 3 (10%), 4 (8%), 5 (20%), 6 (13%), 7 (7%), 8 (11%), 9 (5%), 10 (8%)

10. How many specific items did you pre-plan to purchase? 0 1-9 10-19 20+
0 (4%), 1-9 (66%), 10-19 (20%), 20+ (10%)

11. Did you use a written shopping list? Yes No
Yes (46%), No (54%)

12. About what percent of items purchased were not pre-planned? _______________
0% (45%), 1-9% (9%), 10-19% (9%), 20-29% (10%), 30-39% (4%), 40-49% (2%), 50-59% (12%),
60-69% (2%), 70-79% (1%), 80-89% (2%), 90-99% (2%), 100% (3%)

13. Did you remember everything you came for? Yes No
Yes (92%), No (8%)

14. What type of music was playing in the store?
Soft Rock Rock Classical No music Other ______________
Soft rock (43%), Rock (19%), Classical (2%), No music (24%), other (12%)

15. Was the Music: Very quiet 1 2 3 4 5 6 7 8 9 10 Very loud
1 (25%), 2 (10%), 3 (20%), 4 (10%), 5 (12%), 6 (5%), 7 (7%), 8 (8%), 9 (2%), 10 (1%)

16. Did the music make it difficult to remember every thing you wanted to purchase? Yes No
Yes (12%), No (88%)

17. Was your shopping trip enjoyable?
Enjoyable 1 2 3 4 5 6 7 8 9 10 Frustrating
1 (21%), 2 (12%), 3 (17%), 4 (10%), 5 (11%), 6 (10%), 7 (7%), 8 (7%), 9 (4%), 10 (1%)

If not, was any frustration related to the music that was playing? Yes No
Yes (21%), No (79%)

18. Did you enjoy the music that was playing?
Enjoyable 1 2 3 4 5 6 7 8 9 10 Irritating
1 (16%), 2 (6%), 3 (10%), 4 (9%), 5 (20%), 6 (5%), 7 (7%), 8 (7%), 9 (8%), 10 (12%)

If not, did irritating music cause you to hurry and leave the store quicker? Yes No
Yes (21%), No (79%)

19. Do you feel the music distracted you from fulfilling your shopping plan? Yes No
Yes (15%), No (85%)

20. What type of music do you feel is best for a GROCERY shopping environment?
Soft Rock Rock Classical No music Anything
Soft Rock (30%), Rock (2%), Classical (39%), No Music (25%), anything (4%)

21. What music volume do you think is ideal for a GROCERY shopping environment?
Very quiet 1 2 3 4 5 6 7 8 9 10 Very loud
1 (25%), 2 (14%), 3 (35%), 4 (19%), 5 (7%), 6-10 (0%)
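
As a quick check on how the 1 to 5 / 6 to 10 split used earlier in the thread divides these distributions, the sketch below adds up the posted percentages for ratings 6 through 10 on questions 9, 15, and 17. The percentages are copied from the summary above; the function name and the labels are my own.

```python
def share_above(percent_by_rating, cutoff=5):
    """Sum the percentages for ratings above the cutoff (6-10 by default)."""
    return sum(pct for rating, pct in percent_by_rating.items() if rating > cutoff)

# Percentages copied from the survey summary above (ratings 1 through 10).
hurried = dict(zip(range(1, 11), [13, 5, 10, 8, 20, 13, 7, 11, 5, 8]))       # question 9
loudness = dict(zip(range(1, 11), [25, 10, 20, 10, 12, 5, 7, 8, 2, 1]))      # question 15
frustration = dict(zip(range(1, 11), [21, 12, 17, 10, 11, 10, 7, 7, 4, 1]))  # question 17

for name, dist in [("hurried", hurried), ("loud", loudness), ("frustrated", frustration)]:
    print(f"{name}: {share_above(dist)}% rated 6 or higher")
```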

Thank you again for all of your help. Please let me know what you think.

Beka

deleted-2131 · Post by **deleted-2131** » Fri Mar 21, 2008 8:36 pm

Beka,

I am SO glad to hear that you got SRC and IRB approval. Whew! If you have most of your consent forms, I wouldn't worry too much about getting the rest, especially if your IRB said informed consent was optional.

From what you've said, it sounds like you chose who took your survey by choosing those people who were easy to get a hold of. It doesn't really matter whether you know the people who took your survey; the important thing is that you used a method called convenience sampling. Convenience sampling is, by definition, choosing those individuals who are easy to reach. The problem with convenience sampling is that the results produced by this method of gathering data are often biased. These types of samples are biased because the people who respond to your survey generally are not representative of the total population about which you are trying to perform inference. You can't know for certain if your sample is biased, but based on the sampling method you used, it probably is.

So, what does this mean for you and your project? First, it is going to be difficult to generalize the results of your survey to the general population. So, we want to make it clear that we are not trying to do that. I think that restricting your samples to women in Washington state over 30 years of age is a good idea. Doing so narrows the population you are trying to estimate and brings your sample closer to the population you are trying to draw conclusions about.

Since you know that you have a bias towards the parents of people who are home schooled, it would probably be good to say that. I like the way that you've phrased it:

My population is females in Washington over 30 years old with a home school bias.

The really excellent thing about your project is that you are coming to realize the limitations of your survey. So many people at science fairs just interview their 10 closest neighbors and then try to tell the judges that "everybody" thinks a certain way or likes a certain thing. Just starting to really look at the statistical issues associated with surveys is going to set you apart from the vast majority of human behavior/social science projects at your fair. What you're dealing with is the very issue that organizations like the Gallup polls, the US Census, etc. are trying to deal with. (Of course, they have enormous resources that you don't have, as well as sophisticated statistical methods.) The important thing is that you are critically analyzing your data. This is SO important and SO often left undone.

The best way to sample in such a way as to avoid bias is to allow impersonal chance to select those people who respond to your survey. This is quite easy to do in theory, but quite difficult to do in practice. A sample chosen by chance allows neither favoritism by the sampler nor self-selection by the respondents, both of which introduce bias. Choosing a sample by chance attacks bias by giving all individuals an equal chance to be chosen. (From The Practice of Statistics, by Yates, Moore, and Starnes).

One of the easiest ways to allow chance to choose who takes part in your survey is called a simple random sample (SRS). An SRS of a given size gives each individual in the population an equal chance of being chosen and also gives every possible sample of that size an equal probability of being chosen. Of course, in order to do this you almost have to have a listing of everybody in your population of interest, so in reality this is not very practical, nor is it possible in your case. The important thing to know is that all of the statistical methods that you are and will be using assume that your data were collected using a random sampling method, such as an SRS.
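A minimal sketch of what drawing an SRS looks like when a complete list of the population is available. The list of ID numbers and the `simple_random_sample` helper are hypothetical stand-ins, not anything from this project:

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical sampling frame: ID numbers standing in for a complete list
# of everyone in the population of interest.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=1)

print(len(sample), "people drawn,", len(set(sample)), "of them distinct")
```

The hard part in practice is not the draw itself but building the list to draw from, which is why an SRS is rarely possible for a survey like this one.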

So, now that we have defined our sample population (women in Washington over 30) and recognized that the sample is probably biased because our sample favored those who are involved in home school, the question is what to do next. We do not know for certain that there is a bias in our sample, but we are fairly sure that there is bias since we know our sample is not representative of the population of interest. The other issue is that we do not know the type of bias that is being introduced. Are home school parents more likely to shop at more upscale stores that play different music than the stores that the "other" people shop at (e.g., the music playing in Nordstrom is probably very different from the music playing at K-Mart)? We don't know what the differences are between the sample and the actual population, but try to think of all the different things that could be different. The more ways you can think of that things could be different, the better you'll be able to start to understand how your sample is biased. Try to find data from reputable sources about the number of different stores in the state of Washington, the age distribution of the female population in Washington, etc. The more information you can get about the population you are trying to understand, the better you can understand how your sample is biased.

As to the definition of rock v. soft rock, the important thing at this stage is that you state the definitions that you are using to distinguish "strong beat" from other music and state that you left the difference between soft rock and rock to the discretion of the respondents.

I've got to run, but I'll post more later.

Beka · Post by **Beka** » Mon Mar 24, 2008 4:36 pm

Terik,

Here is the data for the population “Females in Washington over 30 years old with a home school bias.” There were 50 participants and 116 surveys. According to question 20, this group has a bias toward classical music for grocery stores even though only 1% of the stores visited had classical music playing.

Should I start comparing music types and frustration levels? Can I use the websites I used earlier to calculate the statistics even though I did not use SRS for the survey? I need to start making my display in the next few days!!!

Shopping & Music Survey

1. Age: 30-39 40-49 50-59 60+
30-39 years old (19%), 40-49 years old (46%), 50-59 years old (28%), 60+ (7%)

2. Gender: Male Female
female (100%)

3. Name of store: ________________________________
There were 43 different stores visited by subjects. Fred Meyer (20%), Safeway (15%), Albertsons (6%), Costco (7%), Bartell (4%), Goodwill (3%), Wal-Mart (3%), Winco (2%), Borders (2%), JC Penney (2%), Target (2%), Macy’s (2%), Kohls (2%), Top Foods (2%), Value Village (2%), and 28 other stores with 1% each.

4. City, State: __________________________________________
100% from Washington state.

5. Number of different items purchased: 1-9 10-19 20+
0 (3%), 1-9 (56%), 10-19 (24%), 20+ (17%)

6. Number of people in your household:_____________________
1 person – (1%), 2 people – (11%), 3 people – (14%), 4 people – (34%), 5 people – (23%), 6 people – (7%), 7 people – (8%), 9 people – (1%), 11 people – (1%)

7. Number of children with you while shopping:_______________
0 children – (56%), 1 child – (21%), 2 children – (15%), 3 children– (5%), 4 children – (2%), 5 children – (1%)

8. Time spent shopping in the store: 1-15 minutes 15-30 minutes 30+ minutes
1-15 (26%), 15-30 (43%), 30+ (31%)

9. While shopping were you: Leisurely 1 2 3 4 5 6 7 8 9 10 Hurried
1 (15%), 2 (5%), 3 (9%), 4 (7%), 5 (21%), 6 (11%), 7 (9%), 8 (12%), 9 (4%), 10 (7%)

10. How many specific items did you pre-plan to purchase? 0 1-9 10-19 20+
0 (4%), 1-9 (66%), 10-19 (20%), 20+ (10%)

11. Did you use a written shopping list? Yes No
Yes (47%), No (53%)

12. About what percent of items purchased were not pre-planned? _______________
0% (46%), 1-9% (8%), 10-19% (10%), 20-29% (10%), 30-39% (4%), 40-49% (2%), 50-59% (12%), 60-69% (1%), 70-79% (1%), 80-89% (3%), 90-99% (0%), 100% (3%)

13. Did you remember everything you came for? Yes No
Yes (92%), No (8%)

14. What type of music was playing in the store?
Soft Rock Rock Classical No music Other ______________
Soft rock (41%), Rock (18%), Classical (1%), No music (26%), other (14%)

15. Was the Music: Very quiet 1 2 3 4 5 6 7 8 9 10 Very loud
0 (25%), 1 (3%), 2 (9%), 3 (20%), 4 (10%), 5 (9%), 6 (5%), 7 (7%), 8 (9%), 9 (2%), 10 (1%)

16. Did the music make it difficult to remember everything you wanted to purchase? Yes No
Yes (12%), No (88%)

17. Was your shopping trip enjoyable?
Enjoyable 1 2 3 4 5 6 7 8 9 10 Frustrating
1 (21%), 2 (12%), 3 (15%), 4 (8%), 5 (14%), 6 (9%), 7 (7%), 8 (8%), 9 (4%), 10 (2%)

If not, was any frustration related to the music that was playing? Yes No
Yes (21%), No (79%)

18. Did you enjoy the music that was playing?
Enjoyable 1 2 3 4 5 6 7 8 9 10 Irritating
1 (16%), 2 (5%), 3 (11%), 4 (10%), 5 (20%), 6 (6%), 7 (7%), 8 (7%), 9 (6%), 10 (12%)

If not, did irritating music cause you to hurry and leave the store quicker? Yes No
Yes (19%), No (81%)

19. Do you feel the music distracted you from fulfilling your shopping plan? Yes No
Yes (15%), No (85%)

20. What type of music do you feel is best for a GROCERY shopping environment?
Soft Rock Rock Classical No music Anything
Soft Rock (27%), Rock (2%), Classical (51%), No Music (16%), anything (4%)

21. What music volume do you think is ideal for a GROCERY shopping environment?
Very quiet 1 2 3 4 5 6 7 8 9 10 Very loud
0 (14%), 1 (8%), 2 (12%), 3 (39%), 4 (15%), 5 (12%), 6-10 (0%)

Thanks,

Beka

deleted-2131 · Post by **deleted-2131** » Mon Mar 24, 2008 8:03 pm

Beka,

> Here is the data for the population “Females in Washington over 30 years old with a home school bias.” There were 50 participants and 116 surveys. According to question 20, this group has a bias toward classical music for grocery stores even though only 1% of the stores visited had classical music playing.

This is good.

> Should I start comparing music types and frustration levels?

Yes.

> Can I use the websites I used earlier to calculate the statistics even though I did not use SRS for the survey?

You can use the site, but you need to make sure that the judges know you didn't use random chance to choose your sample. At this point, recognizing that and talking about how that influences your results is, in my opinion, more important than crunching numbers.

I would start by making lots and lots of tables that compare percentages of different variables, like you showed in your first post. If you don't know what a number that you calculate means, you shouldn't use that calculation.

Just to answer a couple of quick things: "probability estimates are non-directional" means that the p-value doesn't say whether one thing was higher or lower than the other thing being compared, only whether or not there is a significant difference. (E.g., if you were comparing whether Miracle-Gro flowers grew taller than flowers watered with tap water, the p-value would not tell you if the Miracle-Gro flowers were taller, only whether or not the heights were significantly different.)

The "expected cell frequencies are equal to or greater than 5" is a safeguard to help keep people from blindly applying the Chi-square. If you don't have an expected count of at least 5 in each category, the approximation that the Chi-square test relies on becomes unreliable, and a lot of issues come up.
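To make the "expected cell frequencies" check concrete, here is a rough sketch for a 2x2 table using only the Python standard library. The counts are made up for illustration and are NOT survey results, the function names are hypothetical, and the website may compute things differently (for example, with the Yates correction):

```python
from math import erfc, sqrt

def expected_counts(table):
    """Expected cell counts for a 2x2 table if the two variables were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_2x2(table):
    """Pearson chi-square (no Yates correction) and its p-value for 1 degree of freedom."""
    exp = expected_counts(table)
    chi2 = sum((table[i][j] - exp[i][j]) ** 2 / exp[i][j]
               for i in range(2) for j in range(2))
    # For 1 degree of freedom, P(chi-square > x) equals erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Made-up counts: rows = loud / quiet music, columns = frustrated / not frustrated.
table = [[20, 10], [15, 35]]
exp = expected_counts(table)
smallest_expected = min(min(row) for row in exp)
chi2, p_value = chi_square_2x2(table)

print("expected counts:", exp)
print("every expected count is at least 5:", smallest_expected >= 5)
print("chi-square:", round(chi2, 3), "p-value:", round(p_value, 5))
```

Note that the safeguard is about the expected counts, not the observed ones, so a table can fail the check even when every observed cell looks large enough.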

Make sure you know what a P-value is and means. The worst possible thing to do is to crunch numbers, get P-values and other outputs, throw them on your board and not know what the different numbers mean. If there is something you don't understand, don't do it. Be sure that you understand all the calculations that you do!!!

Beka · Post by **Beka** » Tue Mar 25, 2008 11:06 am

Terik,

Thank you so much for all of your help! The state competition is in 10 days!

I have been reading a couple of statistics books and understand a lot more. I would like to do the statistics calculations myself, to confirm the website answers. However, the statistics books I am reading do not have the formulas, they just explain the concepts. I am going to borrow a college statistics book from my brother’s college library and believe I will be able to do the calculations with help from my family. I just have a few more questions from my earlier posts that I am still not sure about.

My hypothesis was: Music that is loud or has a strong rhythmic beat playing in a store will increase customer’s frustration levels and distract them from their shopping plans when they are in a hurry or have 10+ items to purchase.

Should I show results from the subjects that were hurrying (43%) and separately those with 10+ items to purchase (30%) to compare the results to my hypothesis? And then show results from all subjects, that there was a correlation even when the subjects were not hurrying or planning to purchase a lot?

Or should I show results from all the subjects and just refer to the percent of people in a hurry or with 10+ items?

It would have been a lot simpler if I had only written the first half of my hypothesis!

> Another thing you need to be careful about is your wording. No matter how good your survey is, you cannot say that the music had an effect on the shoppers. You have no evidence that the music actually caused the observed effect on the shoppers. All you have is an observed relationship, which is called a correlation. You can say that shoppers who responded by saying that they were bothered by the loud music tended to be more frustrated, but you cannot then infer that the music actually caused them to be frustrated.

If I am careful with the wording of my conclusion and let the judges know some of the ways the study was biased, what kind of statement can I make about my hypothesis? It seems that I can’t really prove my hypothesis, but can show a correlation for a series of smaller hypotheses. Can I say my hypothesis was partially correct?

Then there is still the research question that I initially asked, “Does the type of music played at a store affect customer’s satisfaction and shopping performance?” Since I did not end up using the terms ‘satisfaction’ and ‘performance’ in my survey, is it okay to rewrite this question to reflect my hypothesis? “Does music that is loud or has a strong rhythmic beat playing in a store influence customers’ frustration levels or distract them?”

Thanks,
Beka

deleted-2131 · Post by **deleted-2131** » Fri Mar 28, 2008 9:14 am

Beka,

Apologies for the delayed response. I am so glad that you have been reading up on some stats books. Hopefully you are gaining a deeper understanding and appreciation for what statistics, when wisely used, can contribute to data analysis.

> Should I show results from the subjects that were hurrying (43%) and separately those with 10+ items to purchase (30%) to compare the results to my hypothesis? And then show results from all subjects, that there was a correlation even when the subjects were not hurrying or planning to purchase a lot? Or should I show results from all the subjects and just refer to the percent of people in a hurry or with 10+ items?

You can certainly do either of these. You could also do both. Just remember to state your null hypothesis and your alternative hypothesis for each statistical test that you perform. Look up these terms in your stats book if you're not clear on them.

> If I am careful with the wording of my conclusion and let the judges know some of the ways the study was biased, what kind of statement can I make about my hypothesis? It seems that I can’t really prove my hypothesis, but can show a correlation for a series of smaller hypotheses. Can I say my hypothesis was partially correct?

You can't say your hypothesis is correct unless your statistical tests show that it is, and even then you have to be careful. IF the results of the statistical tests you run show a significant difference, you can say something along the lines of "My data provide convincing evidence of a strong correlation between music with a strong, rhythmic beat and customer frustration levels." Use this only if you are able to calculate a correlation between your data. If you are not able to calculate a correlation, you might say something like this: "My data also strongly suggest a correlation between music that is loud or has a strong, rhythmic beat and customer frustration levels and distraction. Evidence suggests that this correlation is influenced by the number of items a customer purchased and whether or not the customer is in a hurry."

Changing your question is fine. It's part of the scientific process!

Let me know if you have any more questions. I hope things go well at State!

Beka · Post by **Beka** » Fri Mar 28, 2008 9:22 am

Thank you!!!!

I will write after the state competition and let you know how I did!

Beka

Beka · Post by **Beka** » Thu Apr 03, 2008 7:57 am

Terik,

Just two more days until the fair!

If you see this before then, I have a couple more statistic questions.

I read that the Phi value (Pearson Correlation Coefficient) should be over .75 to show strong correlation, but they are often incorrect with p-values < .001. Most of my p-values were < .001 and most of my Phi values were between .25 and .6. Do I need to include the phi values to say I found a strong correlation? I feel I have enough to explain with the chi-square and p-values!

Also, should I include my population description and biases in my conclusion? I have them listed in the discussion directly under the conclusion on my display.

Thanks again for all of your help!

Beka

deleted-2131 · Post by **deleted-2131** » Thu Apr 03, 2008 10:15 am

Beka,

I'm not sure that I understand your question. The Pearson Product-Moment Correlation Coefficient is generally referred to as r, but there are many different types of correlation and the symbols used sometimes vary from text to text. What text are you using? Perhaps you could post the formula that you are using as well as a detailed description of the process you went through to calculate the statistic in question.

I'm also a bit confused about what you mean by "I feel I have enough to explain with the chi-square and p-values!" The Chi-square is a test that produces a p-value. Do you mean that you have enough to explain about how and why you used a Chi-square and then the interpretation of the p-value produced? Or are you using another method in addition to Chi-square to get an additional p-value? You need to be careful that you aren't doing lots and lots of different kinds of statistical tests lots and lots of times because then you get into issues associated with multiple testing. It's a bit much to explain two days before the fair, but if you go on, we can discuss this more.
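For a rough sense of why multiple testing matters, the sketch below shows how the chance of at least one false positive grows with the number of tests. It assumes the tests are independent and that every null hypothesis is true, which is a simplification. The Bonferroni threshold shown is just one common adjustment, and the function names are hypothetical:

```python
def familywise_error_rate(alpha, m):
    """Chance of at least one false positive across m independent tests
    when every null hypothesis is actually true."""
    return 1 - (1 - alpha) ** m

def bonferroni_threshold(alpha, m):
    """Per-test cutoff that keeps the overall false-positive chance at or below alpha."""
    return alpha / m

for m in (1, 5, 10, 20):
    rate = familywise_error_rate(0.05, m)
    cutoff = bonferroni_threshold(0.05, m)
    print(m, "tests: chance of a false positive =", round(rate, 3),
          "| Bonferroni cutoff =", round(cutoff, 4))
```

The point is only that a p-value below .05 means less when it is one of many that were calculated.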

Your conclusion should be short, sweet, and to the point. Nothing is more unhelpful than a lengthy conclusion. I would say something along the lines of "I found strong evidence that [population description here] [insert conclusion here]." You can discuss the sampling issues in your discussion.

Beka · Post by **Beka** » Thu Apr 03, 2008 12:08 pm

Terik,

Thanks for your quick reply!

Yes, I feel I have enough to explain about how and why I used a Chi-square and the interpretation of the p-value produced. (This has been a crash course in statistics for me!)

I am using the following website for the statistical tests, I did not do the math myself.
http://faculty.vassar.edu/lowry/odds2x2.html
It does chi-square for data where all the expected cell values are over 5 or Fisher Exact Probability if one of the expected cell values is under 5. Most of my 'frustrated' tests use chi-square and my 'distracted' tests use Fisher, is that okay? Also, the website gave the p-values for Yates and Pearson. Is it okay that I listed both p-values on my display board?

One of the values the website produces is a 'phi coefficient'. I spent a long time trying to track down what this meant and if it is part of the data for the Chi-square or a totally different test, and I am still confused.

The main questions are:
If p-values are <.05, does the chi-square test show correlation? Or do I need the phi coefficient to be in a certain range as well? Or is that a different test altogether (Pearson Product-Moment Correlation Coefficient)?

There is an empty spot on my display waiting for my conclusion. I'm really having trouble with the wording. Which of these do you think is best?

My data provide convincing evidence that women over 30 years old in Washington State, who responded by saying store music was loud or had a strong rhythmic beat, tended to be more frustrated and distracted.

or

My data provide convincing evidence of a strong correlation between music with a strong, rhythmic beat and customer frustration and distraction, for women over 30 in Washington State.

or

I found a strong correlation that women over 30 in Washington State tended to be more frustrated and distracted when they described the store music as loud or having a strong beat.

Thanks again,
Beka

deleted-71447 · Post by **deleted-71447** » Fri Apr 04, 2008 8:42 am

Hi Beka,
I don't want to butt in, but I see this is a time sensitive issue for you. In general, simplicity in presentation is good. You will want to make sure that your audience sees the main points of your experiment, and having multiple sets of statistical results can distract them from your more important conclusions. I recommend reporting only the Fisher test, which I believe will be reported for all your comparisons (and not just when one or more expected values are <5). You should also double check the chi-square results (when available) to satisfy yourself that the two tests give the same results in terms of whether a relationship is significant or not. You can keep this in mind in case it comes up in questions, but there's no need to confuse people with two types of tests.
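For anyone who wants to check a Fisher result by hand, here is a minimal sketch of one common two-sided definition: add up the probabilities of every table with the same row and column totals that is no more likely than the observed one. The function name is hypothetical, the counts are made up for illustration, and the website may define its two-tailed value differently:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # The small tolerance keeps exact ties from being dropped by rounding error.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Made-up counts, not survey data.
p = fisher_exact_two_sided(3, 1, 1, 3)
print("two-sided Fisher p-value:", round(p, 4))
```

Because it enumerates exact probabilities instead of relying on an approximation, this test does not need the "expected count of at least 5" condition.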

Here is a page that describes the phi-coefficient:
http://www.andrews.edu/~calkins/math/edrm611/edrm13.htm
and another with some general rules about how to interpret them:
http://www.childrensmercy.org/stats/definitions/phi.htm
The phi-value will show you what sort of correlation you have (positive, negative, or none) and the p-value will tell you whether this correlation is statistically significant (and whether it is worth reporting). You can report your phi-values with accompanying interpretation (positive correlation, no correlation, negative correlation) but make sure that you keep in mind that it is possible to have a strong correlation that is not statistically significant (p is large; could result from random error) due to a small number of samples, and it is possible to have a relatively weak correlation that is statistically significant (p is small; is highly unlikely to result from random error) when the number of samples is large.
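The link between the two numbers can be sketched directly: for a 2x2 table, the Pearson chi-square (without the Yates correction) equals the number of responses times phi squared. The sketch below uses made-up counts, not survey data, and hypothetical function names, to show the same phi giving very different p-values at different sample sizes:

```python
from math import erfc, sqrt

def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]]; ranges from -1 to +1."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def p_value_from_phi(phi, n):
    """p-value of the uncorrected chi-square test, using chi-square = n * phi**2 (1 df)."""
    chi2 = n * phi ** 2
    return erfc(sqrt(chi2 / 2))

# Same proportions in every cell, but 100 times as many responses in the second table.
small = (6, 4, 4, 6)
large = (600, 400, 400, 600)

for counts in (small, large):
    phi = phi_coefficient(*counts)
    p = p_value_from_phi(phi, sum(counts))
    print("n =", sum(counts), "phi =", round(phi, 3), "p =", p)
```

Both tables have the same phi, yet only the larger one comes out statistically significant, which is why the strength of a correlation and its significance need to be reported as separate things.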

All 3 versions of your conclusions sound fine to me. Another possibility, if you prefer to avoid using the first person, would be something like "A strong correlation exists between music with a strong, rhythmic beat and customer frustration and distraction for women over 30 surveyed in Washington State." You might want to substitute "significant" for "strong".
Good luck!
Chris

deleted-2131 · Post by **deleted-2131** » Fri Apr 04, 2008 5:03 pm

Beka,

I echo what Chris said completely.

Chris,

Thanks for jumping in!

--Terik

Beka · Post by **Beka** » Fri Apr 11, 2008 3:13 pm

Terik & Chris,

Thank you for all of your help! I did not have time to redo the discussion section on my display and you were right about the multiple tests distracting the judges!

I did not make it to Intel ISEF this year.

The competition at our state fair is very tough and I do not have a regional fair that I can go to, but I have two more years to try!

I am very happy about all of the things I learned through this process and know it will help me in college.

Thanks again,

Beka

deleted-2131 · Post by **deleted-2131** » Fri Apr 11, 2008 5:00 pm

Beka,

I'm glad that you're feeling comfortable with the outcome of the fair. We wish you'd gone on to ISEF, too, but I'm glad that you are ready to keep trying. You've learned a lot this year, both about science and about how to present it, and all of that experience will help you be able to do an even better project next year. Congratulations on a job well done, on your dedication, your hard work, and your tenacity.

Terik
