Analysing country-level data on Luxembourgish financial aid for students, which gives insights into where students go, what they study and which subjects are prevalent by sex.
What is Luxembourgish financial aid for higher education?
The “aide financière” is offered by the Luxembourg government to students who are enrolled on an accredited higher education course at an accredited institution and who meet certain admission criteria. It comprises several grants and a student loan, and it is paid in two instalments per academic year, one per semester.
Description of the Data
The data for this post comes from the Ministry of Higher Education and Research (MESR) in Luxembourg and is about applications for state financial aid for higher education made by students to the government. The data includes information on the status (rejected, accepted) of applications, as well as the amounts that are paid out each semester.
The two tabular data sets that I am mainly concerned with are the actual amounts paid out…
In the following, I’ll ask and answer questions that I was interested in while inspecting the data set.
How many applications for state financial aid are made each semester?
The chart shows that relatively more applications are made for the summer semester, presumably because more students start their studies in the winter semester and potentially drop out in the subsequent one. Seeing more than 30,000 applications is striking, given that this corresponds to roughly 5% of the country’s population.
Code
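# Packages used by the code in this post
library(tidyverse)
library(lubridate)  # year()
library(scales)     # comma_format(), percent_format()
library(tidytext)   # reorder_within(), scale_y_reordered()

effectifs %>%
  group_by(date) %>%
  summarise(etudiants = sum(total_etudiants, na.rm = TRUE)) %>%
  ungroup() %>%
  # Year-on-year change: with one row per semester, a lag of 2
  # compares each semester with the same semester one year earlier
  mutate(yoy = etudiants / lag(etudiants, n = 2) - 1) %>%
  ggplot(aes(date, etudiants)) +
  geom_line(lty = "dotted") +
  geom_point() +
  labs(title = "Applications made to the MESR each semester",
       y = "Applications",
       x = NULL) +
  scale_y_continuous(labels = comma_format())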
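The `yoy` column in the code above is computed with `lag(etudiants, n = 2)`. A minimal sketch of what that does, assuming the data holds exactly one row per semester in chronological order (the numbers below are invented for illustration and are not taken from the MESR data):

```r
library(dplyr)

# Toy data (invented): two semesters per year, sorted by date
toy <- tibble(
  date = as.Date(c("2019-03-01", "2019-10-01", "2020-03-01", "2020-10-01")),
  etudiants = c(100, 120, 110, 150)
)

toy_yoy <- toy %>%
  arrange(date) %>%
  # lag(n = 2) looks two rows back, i.e. the same semester one year earlier
  mutate(yoy = etudiants / lag(etudiants, n = 2) - 1)

print(toy_yoy)
```

The first two rows have no value to compare against, so their `yoy` is `NA`. If a semester were missing from the data, the lag would silently compare the wrong periods, so this approach only holds under the one-row-per-semester assumption.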
The chart below shows that there are more applications from female students than from male students, which may indicate that male students are underrepresented in higher education.
Code
effectifs %>%
  group_by(date, sexe) %>%
  summarise(etudiants = sum(total_etudiants, na.rm = TRUE)) %>%
  ggplot(aes(date, etudiants, colour = sexe)) +
  geom_line(lty = "dotted") +
  geom_point() +
  labs(title = "Applications by sex over time",
       colour = "Sex",
       y = "Students",
       x = NULL) +
  scale_y_continuous(labels = comma_format())
How many applications are rejected?
Looking at the rejection percentage out of all applications, it looks like there is a slight upwards trend. Notably, summer semester applications are rejected at a higher rate, likely because new students make more mistakes in their application procedure and have to try again.
Interestingly, applications from male students are rejected at higher rates. There are several possible reasons, for example carelessness when preparing applications or certain criteria that male students fail to meet more often. As there are not many predictors in the data, it will be hard to find the true reason here.
Looking at rejections by country, it becomes clear that 1) some countries have higher rates of rejection than others and that 2) there is a time trend for some countries whereas there is none for others. Again, finding reasons for that with the data at hand is likely not possible.
The chart below shows an upward trend in popularity for Austria, France, Canada, Spain, Ireland, Italy and the Netherlands. In contrast, Belgium, Switzerland, Germany and the United States have lost popularity.
Code
effectifs %>%
  select(date, pays_etablissement_iso3, total_etudiants) %>%
  group_by(date, pays_etablissement_iso3) %>%
  # na.rm = TRUE added for consistency with the other summaries
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  ungroup() %>%
  group_by(date) %>%
  # Share of each country among all students in a given semester
  mutate(students = students / sum(students)) %>%
  ggplot(aes(date, students, colour = pays_etablissement_iso3)) +
  geom_point() +
  geom_line() +
  geom_smooth(method = "lm", se = F, lty = "dotted", size = 0.5) +
  labs(title = "Which countries do Luxembourgish students choose for their studies?",
       y = "Percentage of students",
       x = "Year") +
  facet_wrap(~ pays_etablissement_iso3, scales = "free") +
  scale_y_continuous(labels = percent_format()) +
  theme(legend.position = "none",
        panel.grid.major.x = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.minor.y = element_blank())
The most popular countries in absolute terms can be seen below:
Code
effectifs %>%
  transmute(year = year(date), country = pays_etablissement_iso3, total_etudiants) %>%
  group_by(year, country) %>%
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  ggplot(aes(x = students,
             y = country %>% reorder_within(students, year))) +
  geom_col() +
  facet_wrap(~ year, scales = "free_y") +
  labs(title = "CEDIES applications by year and country",
       y = NULL,
       x = "Students") +
  scale_y_reordered() +
  scale_x_continuous(labels = comma_format())
The chart below also shows that a noticeably larger share of female students goes to Belgium. One possible explanation is that universities in Belgium offer subjects that female students choose more frequently than male students.
Code
effectifs %>%
  transmute(year = year(date), sexe, country = pays_etablissement_iso3, total_etudiants) %>%
  filter(!country %in% c("Pays manquant", "Autres pays")) %>%
  group_by(year, sexe, country) %>%
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  # Still grouped by year and sexe here: share of each country within year and sex
  mutate(students = students / sum(students)) %>%
  ungroup() %>%
  pivot_wider(names_from = sexe, values_from = students) %>%
  mutate(ppts_delta = (`F` - M) * 100) %>%
  ggplot(aes(ppts_delta,
             country %>% reorder_within(ppts_delta, year),
             fill = ifelse(ppts_delta > 0, "More Women", "More Men"))) +
  geom_col() +
  labs(title = "Countries chosen for studies by Luxembourgish students applying for\nfinancial aid by year and sex",
       subtitle = "Methodology: Total number of students by year and sex is basis for calculation.\nCalculate percentage by year and sex going into each country, then subtract both percentages.\nResulting metric is the delta between both sexes for each country and year in percentage points.",
       y = NULL,
       x = "ppts Difference (Women-Men)",
       fill = NULL) +
  facet_wrap(~ year, scales = "free") +
  theme(axis.text.y = element_text(size = 8)) +
  scale_y_reordered() +
  scale_x_continuous(labels = comma_format(suffix = " ppts")) +
  scale_fill_manual(values = c("midnightblue", "firebrick")) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 7))
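The percentage-point metric described in the chart subtitle can be easier to follow on a tiny example. The sketch below uses invented counts for a single year and two hypothetical country codes; only the column names `sexe`, `F` and `M` mirror the real data.

```r
library(dplyr)
library(tidyr)

# Toy data (invented): student counts by sex and country for one year
toy <- tibble(
  sexe     = c("F", "F", "M", "M"),
  country  = c("BEL", "DEU", "BEL", "DEU"),
  students = c(60, 40, 30, 70)
)

toy_delta <- toy %>%
  group_by(sexe) %>%
  # Share of each country within each sex (60% of women go to BEL, 30% of men)
  mutate(share = students / sum(students)) %>%
  ungroup() %>%
  select(-students) %>%
  pivot_wider(names_from = sexe, values_from = share) %>%
  # Difference between the two shares, expressed in percentage points
  mutate(ppts_delta = (`F` - M) * 100)

print(toy_delta)
```

Because the shares of each sex sum to 100%, the deltas across all countries sum to zero within a year: a surplus of women in one country must be offset by a surplus of men elsewhere. The metric therefore describes relative preferences, not absolute head counts.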
Which studies do Luxembourgish students choose?
Interestingly, architecture, education, languages, medicine and anthropology are strongly losing popularity. Conversely, computer science, engineering, mathematics, health professions and psychology are strongly gaining in popularity, which is likely linked to generally higher pay in these areas.
Generally, most students go into business studies or health professions:
Code
effectifs %>%
  # The grouping variable is the field of study, not the country
  transmute(year = year(date), domain = domaine_formation, total_etudiants) %>%
  group_by(year, domain) %>%
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  ggplot(aes(x = students,
             y = domain %>% reorder_within(students, year))) +
  geom_col() +
  facet_wrap(~ year, scales = "free_y") +
  labs(title = "CEDIES students' fields of study by year",
       y = NULL,
       x = "Students") +
  scale_y_reordered() +
  scale_x_continuous(labels = comma_format())
Is there a difference in interests between female and male students?
The quick answer: yes. The more difficult answer would require a proper piece of research into the reasons. However, the chart makes it clear that more women go into health professions, languages and education, whereas men choose engineering, business studies and computer science more often.
Code
effectifs %>%
  transmute(year = year(date), sexe, domaine_formation, total_etudiants) %>%
  group_by(year, sexe, domaine_formation) %>%
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  # Still grouped by year and sexe here: share of each subject within year and sex
  mutate(students = students / sum(students)) %>%
  ungroup() %>%
  pivot_wider(names_from = sexe, values_from = students) %>%
  mutate(ppts_delta = (`F` - M) * 100) %>%
  ggplot(aes(ppts_delta,
             domaine_formation %>% reorder_within(ppts_delta, year),
             fill = ifelse(ppts_delta > 0, "More Women", "More Men"))) +
  geom_col() +
  labs(title = "Study subjects chosen by Luxembourgish students applying for\nfinancial aid by year and sex",
       subtitle = "Methodology: Total number of students by year and sex is basis for calculation.\nCalculate percentage by year and sex going into each study domain, then subtract both percentages.\nResulting metric is the delta between both sexes for each subject and year in percentage points.",
       y = NULL,
       x = "ppts Difference (Women-Men)",
       fill = NULL) +
  facet_wrap(~ year, scales = "free") +
  theme(axis.text.y = element_text(size = 8)) +
  scale_y_reordered() +
  scale_x_continuous(labels = comma_format(suffix = " ppts")) +
  scale_fill_manual(values = c("midnightblue", "firebrick")) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5, size = 7))
Are there any observable trends?
Code
effectifs %>%
  select(date, sexe, domaine_formation, total_etudiants) %>%
  group_by(date, sexe, domaine_formation) %>%
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  ungroup() %>%
  pivot_wider(names_from = sexe, values_from = students) %>%
  # Share of male students within each subject and semester
  mutate(male = M / (`F` + M)) %>%
  select(-c(`F`, M)) %>%
  ggplot(aes(date, male, colour = domaine_formation)) +
  geom_line() +
  geom_point() +
  labs(title = "Percentage of male students by study subject",
       y = NULL,
       x = NULL) +
  geom_smooth(method = "lm", se = F, lty = "dotted", size = 0.5) +
  facet_wrap(~ domaine_formation, scales = "free_y") +
  scale_y_continuous(labels = percent_format()) +
  theme(legend.position = "none")
It looks like the percentage of male students decreases in the majority of subjects. Are men moving out of higher education?
Code
effectifs %>%
  select(date, sexe, total_etudiants) %>%
  group_by(date, sexe) %>%
  summarise(students = sum(total_etudiants, na.rm = TRUE)) %>%
  ungroup() %>%
  pivot_wider(names_from = sexe, values_from = students) %>%
  # Share of male students among all applicants per semester
  mutate(male = M / (M + `F`)) %>%
  ggplot(aes(date, male)) +
  geom_line() +
  geom_point() +
  geom_smooth(method = "lm", se = F, lty = "dotted", size = 0.5) +
  labs(title = "Percentage of male students among CEDIES applicants",
       y = NULL,
       x = NULL) +
  scale_y_continuous(labels = percent_format())
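The dotted trend line in the chart is a simple linear fit. To put a number on it, one could fit the same kind of model by hand and read off the slope. The sketch below does this on invented values, not on the MESR data; the `years` helper column is an assumption introduced only to express the slope per year.

```r
# Toy data (invented): share of male applicants in four winter semesters
toy <- data.frame(
  date = as.Date(c("2018-10-01", "2019-10-01", "2020-10-01", "2021-10-01")),
  male = c(0.48, 0.47, 0.46, 0.45)
)

# Express time in years since the first observation so the slope reads
# as "change in male share per year"
toy$years <- as.numeric(toy$date - min(toy$date)) / 365.25

# Ordinary least squares, the same model family as geom_smooth(method = "lm")
fit <- lm(male ~ years, data = toy)
slope_per_year <- coef(fit)[["years"]]

print(slope_per_year)
```

A negative slope means the male share falls over time. Since the data only covers students who applied for financial aid, such a slope describes applicants and does not by itself show that men are leaving higher education.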
If you made it this far, thank you very much for reading. I hope you enjoyed this post. Please feel free to reach out.