“Should I stay or should I go?” Indicators of Dropout Thoughts of Doctoral Students in Computer Science

The project on which this report is based was funded by the Federal Ministry of Education and Research under the reference number 01FP1707. Responsibility for the contents of this publication lies with the authors.

Abstract
Evidence in the literature indicates that doctoral candidates may experience increased levels of stress and worry about successfully completing their doctoral degrees. As a result, a significant number of doctoral candidates drop out. In our study with 424 doctoral students in computer science (113 women, 311 men), we ask about the frequency of dropout thoughts as an indicator of possible premature termination. By means of machine learning algorithms, we extract variables associated with a higher or lower likelihood of dropout thoughts. In particular, satisfaction with advisor's support, experiencing a crisis, professional self-efficacy, choice of advisor, and perceived meaningfulness of additional work tasks proved to be of central importance. Based on these results, we suggest taking steps to improve professional and social support for doctoral students. Recommendations include implementing more intensive supervision in the early stages of the doctorate, improving the match between doctoral candidates' expectations and the requirements of the respective institute, monitoring progress during the doctorate (e.g., with the help of an advisor agreement), and increasing the qualifications of advisors to include leadership and communication skills.


Difficulties during a Doctorate
A growing number of empirical studies point to heightened psychological distress and mental health problems among doctoral students, independent of the dissertation subject. As Cornwall et al. (2019, p. 363) noted, "stress during doctoral study is common", and Barry, Woods, Warnecke, Stirling, and Martin (2018, p. 468) observed that "psychological distress is prevalent in doctoral degree training". Since a doctorate is time- and energy-consuming, may take approximately five years to finish and costs a great deal of money, the assumption of high stress is not surprising. Work-related stress symptoms include, among others, time and financial pressures (Cornwall et al., 2019), perceived social pressures due to the work-related environment (Stubb, Pyhältö, & Lonka, 2011), concerns about the future and performance-related concerns (Barry et al., 2018). Considering this situation, it is not surprising that quite a few doctoral students think about terminating their studies. Jaksztat, Preßler, and Briedis (2012) conducted a questionnaire survey with 2,850 doctoral candidates (1,396 women) at German universities and found that 43% of the respondents had considered discontinuing their doctoral studies, women more often than men (46% vs. 40%) and doctoral students with teaching obligations more often than those without such obligations (48% vs. 39%). Nearly half of those at risk of dropping out questioned their competence, and a considerable number of those potential dropouts were dissatisfied with the support from their mentors and from the scholarly community. In a similar vein, more than half of the participants in the study of Stubb et al. (2011) perceived the scientific community within and outside of their faculty as a source of burden instead of support and encouragement.

Gender and Success in PhD Studies
Whereas the results of Jaksztat et al. (2012) seem to corroborate the assumption that women may be more prone to consider dropping out, the results of other studies do not reveal this gender difference. Seagram, Gould, and Pyke (1998) and Wright and Cochrane (2000) found equal success rates for male and female doctoral candidates. Likewise, Wollast et al. (2018) found no gender difference in the success rates of Belgian PhD students. What mattered more was the type of subject. Compared to humanities students, science and technology students submitted their theses after a shorter period (Wright & Cochrane, 2000) and had a higher rate of doctorate completion (Wollast et al., 2018). Even though the number and duration of successful doctorate completion may not differ by gender, women and men seem to have different levels of ease in typically female or male subjects. For example, in male-dominated subjects (such as engineering and technology), women were found to have a lower academic self-concept and lower career ambitions than men (Ülkü-Steiner, Kurtz-Costes, & Kinlaw, 2000), and they were more likely to drop out, particularly in their first year of studies (Bostwick & Weinberg, 2018). On the other hand, both female and male doctoral students felt less accepted in faculties that did not match their career aspirations and values (Pifer & Baker, 2014), for example, when attempting to combine their career and family or strive for job security.
Summarizing the research results so far, it can be concluded that even though doctoral students are prone to high psychological distress, supportive, competent and positive advisor behaviour may help to diminish distress and be a key factor in the successful performance of doctoral students. Additionally, the support of other social agents, such as family, friends, faculty members and colleagues, may be positively related to the well-being of doctoral students (Schmidt & Hansson, 2018) but somewhat less related to success (De Clercq et al., 2019).

Research Question
The research results so far indicate that doctoral students seem to be prone to high psychological distress. However, individual and structural variables may contribute to stress reduction or, on the contrary, to heightened stress. Individual variables include academic self-concept/self-efficacy and gender/gender role self-concept (paragraph 1.3). As computer science is a predominantly male subject in many countries, including Germany, where our study was conducted, we were interested in finding possible special conditions for male and female doctoral students in computer science. We therefore assessed not only perceived stress but also individual variables such as self-efficacy, biological sex, and gender role self-concept, which could all play a role in the process of finishing a doctorate.
Among the structural variables that may have an impact on doctoral students' perceived stress, advisors' support is deemed most important (see 1.2 above). In particular, supportive, competent and positive advisor behaviour may help to diminish distress and be a key factor in the successful performance of doctoral students. In the survey study reported below, we therefore investigated the working conditions of computer science doctoral students, their perceived support and the extent of dropout thoughts. To this end, we developed an online questionnaire that covers sociodemographic variables, perceived working conditions, and psychological variables such as self-efficacy and gender role self-concept. In this paper, we specifically ask about risk factors for possible dropout from a PhD programme. To identify the most important indicators, we use the decision tree method, a machine learning method that is explained below (2.4).

Context
Since the analyses in this paper are based on the results of a study conducted in the German-speaking higher education sector, it is important to discuss the peculiarities of the German higher education system. Compared to other countries, Germany has a relatively high rate of doctorate degrees awarded. In the understanding of the German Rectors' Conference (Hochschulrektorenkonferenz, HRK, board of directors of universities and colleges), the training of doctoral students is not the third stage of studies following the acquisition of bachelor's and master's degrees but rather the first stage of a research career or any other career that requires proof of independent research performance (dissertation). This is particularly essential for any student in the natural sciences who plans to work as a scientist in a research lab (at universities, public or private companies and the like). However, in other subjects, obtaining a doctorate may also be helpful for promotion in an academic career. Therefore, this also applies to computer science.
Typically, a master's degree is a requirement for a doctoral degree, and the time span to earn a doctoral degree ranges from three to five years, but the period may also be shorter or longer. The main objective during that time is to write a thesis by means of either a monograph or several publications under the supervision of one or more advisors. This may happen within research groups or individually. Depending on the PhD regulations of a faculty, additional requirements may apply, e.g., additional courses or publications. The majority of doctoral candidates are affiliated with a faculty and earn their living either by means of a scholarship or as employees of the faculty/university. In the latter case, this may be connected with teaching obligations of up to four teaching hours per week (plus preparation and follow-up) for seven months per year. Only a small minority of doctoral candidates (so-called external candidates) earn their living outside the university system.
In computer science subjects in Germany, 7,297 doctoral students were pursuing a doctorate in 2019 (Statistisches Bundesamt, 2020a, p. 29), of whom 1,334 were women (equivalent to 18.3%). In 2019, 1,032 students earned a PhD in computer science (Statistisches Bundesamt, 2020b, p. 13), 169 (16.4%) of whom were women (ibid, p. 15). Assuming all doctoral students would finish their doctorate, the duration of the doctorate would correspond to seven years. Unfortunately, there are no statistics on the actual number of dropouts from a doctorate programme. However, if the average duration of a doctorate degree is estimated at five years, approximately 2,100 persons would not complete their doctorate in this subject, which would correspond to 30% of doctoral students.
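The estimate above can be retraced with a few lines of arithmetic. The following is only a sketch that reuses the figures quoted in the text; the five-year duration is the assumption stated there, and the variable names are ours.

```python
# Figures quoted in the text (Statistisches Bundesamt, 2020a, 2020b).
enrolled = 7297            # doctoral students in computer science in 2019
graduates_per_year = 1032  # doctorates awarded in computer science in 2019

# If every doctoral student finished, the number enrolled divided by the
# yearly number of graduates gives the implied average duration.
implied_duration = enrolled / graduates_per_year

# Assumption taken from the text: an average duration of five years.
assumed_duration = 5
expected_completers = graduates_per_year * assumed_duration
non_completers = enrolled - expected_completers
dropout_share = non_completers / enrolled

print(f"implied duration: {implied_duration:.1f} years")  # 7.1 years
print(f"non-completers:   {non_completers}")              # 2137
print(f"dropout share:    {dropout_share:.1%}")           # 29.3%
```

Rounded, these values correspond to the seven years, the approximately 2,100 persons and the 30% given in the text.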

Participant Characteristics
The data used in this study are based on an online survey that we conducted with German doctoral students of computer science who were contacted via the respective faculties and universities. The questionnaire was started by 763 persons and finished by 433 persons. Due to inconsistent data as well as a high frequency of missing values, nine observations had to be excluded from the sample, so the following report refers to a sample of 424 persons containing 113 female doctoral students (26.7%) and 311 male doctoral students (73.3%). The proportion of women in our sample is larger than the proportion (18.3%) found in the data of the Federal Statistical Office (Statistisches Bundesamt, 2020a, p. 29). Correspondingly, fewer men (73.3% vs. 81.7%) were included in our sample. The respondents in our study were on average M = 30.7 years old (SD = 4.09). A total of 49.5% of the respondents studied computer science, followed by business informatics (12.0%), engineering (4.5%) and media informatics (4.5%). The remaining 22.4% (n = 98) studied various subjects. When asked which field they would classify their doctoral thesis in, 59.5% of the women and 47.5% of the men answered applied computer science, 28.3% answered practical computer science (15.3% of the women and 32.9% of the men), and 21.2% answered theoretical computer science (25.5% of the women and 19.6% of the men), not taking into account those participants who classified their subject as "Other" or were not yet able to specify it. The gender difference is significant (χ² = 11.04, df = 2, N = 378, p = .004).
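As a plausibility check, the reported sample shares and the p value of the chi-square test can be recomputed from the numbers given above. The sketch below uses the fact that, for two degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form exp(-x/2); the function name is ours.

```python
import math


def chi2_upper_tail_df2(statistic):
    """Upper-tail probability of a chi-square variable with df = 2."""
    return math.exp(-statistic / 2.0)


# Test statistic reported in the text (df = 2, N = 378).
p_value = chi2_upper_tail_df2(11.04)
print(f"p = {p_value:.3f}")

# Gender shares in the final sample of 424 persons.
n_women, n_men = 113, 311
n_total = n_women + n_men
share_women = 100 * n_women / n_total
share_men = 100 * n_men / n_total
print(f"women: {share_women:.1f}%, men: {share_men:.1f}%")
```

Both results agree with the values reported in the text (p = .004; 26.7% women and 73.3% men).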

Recruitment
The survey was conducted online via the platform unipark (unipark.de; now questback.com). The respondents were contacted via the relevant faculties, institutes, doctoral colleges, and graduate schools of German universities; via advisors; and through networks for doctoral students. For this purpose, a letter with a corresponding link was sent to possible respondents. The respondents were informed about the purpose of the research and about data protection, and it was pointed out that participation was voluntary and that they could cancel their participation at any time. The survey data were then stored anonymously. The respondents did not have to answer all questions to finish the survey.

Statistical Analysis: Decision Tree and Random Forest
In this study, decision trees and random forests were used to explore the underlying factors of the frequency of dropout thoughts. Based on machine learning, these methods differ from conventional regression models. They do not estimate parameters but rather predict values for the variables of interest based on the test samples. Basically, this method offers two advantages: First, nonlinear relationships and interactions are better accommodated than in traditional linear models, and second, there are no restrictions regarding the total number of variables considered in a model, which is especially useful for exploratory purposes (Molina & Garip, 2019, p. 30f.). Furthermore, decision trees "are popular because of their easy interpretability and low bias" (Behr, Giese, Teguim, & Theune, 2020, p. 753).
Based on the principle of recursive partitioning, the CART algorithm is used to create decision trees and random forests. Starting from a single cluster that represents the whole sample, "the C[A]RT procedure examines all possible independent, or splitting, variables and selects the one that results in binary groups [/clusters] that are most different with respect to the dependent variable, according to a predetermined splitting criterion" (e.g., Gini impurity) (Lemon, Roy, Clark, Friedmann, & Rakowski, 2003, p. 173). The algorithm then continues to split both the newly created groups and the resulting subgroups until a stopping rule is reached. The result is a tree whose leaves correspond to subgroups "whose members share common characteristics that influence the dependent variable of interest" (Lemon et al., 2003, p. 173).
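The split search described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: it assumes a numeric outcome and uses the within-group sum of squared deviations as the splitting criterion (the text names Gini impurity as one example of a criterion). The variable names and toy values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, outcome, predictors):
    """Return (predictor, threshold, impurity) of the best binary split.

    rows: list of dicts, outcome: key of the dependent variable,
    predictors: keys of the candidate splitting variables.
    """
    best = None
    for name in predictors:
        levels = sorted({row[name] for row in rows})
        # Candidate thresholds lie halfway between neighbouring observed values.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            left = [r[outcome] for r in rows if r[name] <= threshold]
            right = [r[outcome] for r in rows if r[name] > threshold]
            impurity = sse(left) + sse(right)
            if best is None or impurity < best[2]:
                best = (name, threshold, impurity)
    return best


# Toy data with hypothetical variable names and values (not study data).
toy = [
    {"crisis": 1, "support": 5, "dropout": 1},
    {"crisis": 1, "support": 2, "dropout": 3},
    {"crisis": 1, "support": 4, "dropout": 2},
    {"crisis": 3, "support": 4, "dropout": 7},
    {"crisis": 3, "support": 2, "dropout": 8},
    {"crisis": 3, "support": 5, "dropout": 6},
]
split = best_split(toy, "dropout", ["crisis", "support"])
print(split)  # ('crisis', 2.0, 4.0)
```

Applying the same search recursively to the two resulting groups, until a stopping rule is met, yields the tree structure.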
The abovementioned stopping rules are specified to control the depth of the tree in order to prevent overfitting. The decision trees shown later are pruned so that they comply with the 'one standard error rule': This rule specifies using the tree with the smallest number of leaves, whose misclassification cost is less than or equal to the sum of the classification cost and standard error of the tree with the lowest classification cost (Breiman, Friedman, Olshen, & Stone, 1984, p. 80).
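The one-standard-error rule can be made concrete with a small sketch. It assumes that, for each pruned subtree, the number of leaves, the cross-validated cost and its standard error are already available; the numbers in the table below are hypothetical and serve only as an illustration.

```python
def one_se_choice(candidates):
    """Pick the smallest tree whose cost lies within one SE of the best tree.

    candidates: list of (n_leaves, cv_cost, std_error), one per pruned subtree.
    """
    best = min(candidates, key=lambda c: c[1])  # tree with the lowest cost
    limit = best[1] + best[2]                   # lowest cost plus its SE
    eligible = [c for c in candidates if c[1] <= limit]
    return min(eligible, key=lambda c: c[0])    # fewest leaves among eligible


# Hypothetical cross-validation results (not taken from the study).
pruning_table = [
    (2, 0.95, 0.05),
    (4, 0.82, 0.04),
    (6, 0.78, 0.04),
    (9, 0.76, 0.04),
    (14, 0.77, 0.04),
]
chosen = one_se_choice(pruning_table)
print(chosen)  # (6, 0.78, 0.04)
```

In this example, the tree with nine leaves has the lowest cost (0.76), but the tree with six leaves stays below the limit of 0.76 + 0.04 = 0.80 and is therefore preferred.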
Regarding the data set, a slight restriction was made to calculate the decision tree: only the 150 variables that had already shown significant correlations with the frequency of dropout thoughts in pre-analyses were used. Not every participant answered every single question. In the case of a missing answer, the algorithm uses the next best predictor variable, whose split is most similar to the best split (Twala, 2009, p. 6ff.). This 'next best predictor variable' is called a surrogate split. The next step is an evaluation using random forests. In random forests, several decision trees are calculated, and the results are averaged. This is based on a method called bootstrap aggregation (short: bagging). To reduce the high variance of a single decision tree, a number of trees (in this study: 500) are each fitted to a bootstrap sample from the data and then aggregated (Breiman et al., 1984). The term "random" refers to the fact that, with a certain probability, variables are used at higher-level decision nodes that otherwise would have been used only at the lower levels of a decision tree due to their comparably low predictive power for the dependent variable. This counteracts the underestimation of the predictive power of a variable. Overall, random forests lead to more accurate predictions, which are represented graphically in this study by a minimum depth plot.
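The following sketch illustrates bootstrap aggregation with randomly selected predictors. It is a deliberate simplification and not the procedure used in the study: each "tree" is reduced to a single split (a stump), so the random predictor subset is drawn once per tree rather than at every node. All function and variable names, the parameter n_try and the toy data are hypothetical.

```python
import random


def fit_stump(rows, outcome, predictors):
    """Fit a one-split regression tree and return a prediction function."""
    overall = sum(r[outcome] for r in rows) / len(rows)
    best = None
    for name in predictors:
        levels = sorted({r[name] for r in rows})
        for low, high in zip(levels, levels[1:]):
            cut = (low + high) / 2
            left = [r[outcome] for r in rows if r[name] <= cut]
            right = [r[outcome] for r in rows if r[name] > cut]
            l_mean, r_mean = sum(left) / len(left), sum(right) / len(right)
            cost = (sum((v - l_mean) ** 2 for v in left)
                    + sum((v - r_mean) ** 2 for v in right))
            if best is None or cost < best[0]:
                best = (cost, name, cut, l_mean, r_mean)
    if best is None:  # the sampled predictors do not vary in this sample
        return lambda row: overall
    _, name, cut, l_mean, r_mean = best
    return lambda row: l_mean if row[name] <= cut else r_mean


def fit_forest(rows, outcome, predictors, n_trees=500, n_try=1, seed=0):
    """Average n_trees stumps, each fitted to its own bootstrap sample."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        subset = rng.sample(predictors, n_try)     # random predictor subset
        trees.append(fit_stump(sample, outcome, subset))
    return lambda row: sum(tree(row) for tree in trees) / len(trees)


# Toy data with hypothetical variable names and values (not study data).
toy = [
    {"crisis": 1, "support": 5, "dropout": 1},
    {"crisis": 1, "support": 2, "dropout": 3},
    {"crisis": 1, "support": 4, "dropout": 2},
    {"crisis": 3, "support": 4, "dropout": 7},
    {"crisis": 3, "support": 2, "dropout": 8},
    {"crisis": 3, "support": 5, "dropout": 6},
]
forest = fit_forest(toy, "dropout", ["crisis", "support"])
high = forest({"crisis": 3, "support": 2})
low = forest({"crisis": 1, "support": 5})
print(round(high, 2), round(low, 2))
```

Because each stump sees a different bootstrap sample and a different predictor subset, the weaker predictor also gets the chance to define a split, which is the idea behind the term "random" in the text.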
Unlike single decision trees, forests based on the CART algorithm we used cannot replace missing answers with surrogate splits. To maintain a certain number of cases, the number of variables used had to be limited. To do so, we based our preselection of variables on the results of the decision tree: The variables used directly from the decision tree and its surrogate splits were included.

Measures
In the online questionnaire, we considered a number of variables and standardized measurement instruments that seemed to be important predictors for doctorate completion, derived from the literature review, as summarized in the introduction. Apart from demographic variables such as age, marital status, gender, and master's degrees, we asked for more detail about the doctoral phase, e.g., vocational training, topic of the thesis, duration of the doctorate, perceived stress, advisors' support and experiences working at the university. We also asked about previous publications and satisfaction with the support experienced from the advisor and other agents with regard to the specification of the research question and the doctorate in general. Using standardized measuring instruments, we asked about perceived social support (Fydrich, Sommer, Tydecks, & Brähler, 2009), the gender role self-concept of instrumentality and expressiveness (Goldschmidt, Linde, Alfermann, & Brähler, 2014), the professional motives of the respondents (Kanning, 2016), and their professional self-efficacy (Abele, Stief, & Andrä, 2000). The psychological measures were standardized tests developed with representative samples. The other questions were based on a published questionnaire of the Statistisches Bundesamt and on the Jaksztat et al. (2012) study. The questionnaire was pretested by four experts on the research team, including two of the authors.
As our main objective in this study is to determine which variables may be regarded as most important for the decision of doctoral students to continue or to drop out during their doctoral education, we asked our participants about any dropout thoughts they had in the past. This can be regarded as a kind of criterion variable. We assessed thoughts about dropping out of the doctorate process with the following question: "How often have you thought about stopping your doctorate while working on it?" Participants could rate their answer on an eleven-point scale from 0 (never thought about it) to 10 (thought about it very often). To better understand the following results, the most important variables are presented here:
(1) Experiencing crises: Participants were asked whether crises had occurred so far in the doctoral process. The answer options were 1) "no, I had no major crisis so far"; 2) "I had some difficulties, but not a major crisis"; and 3) "yes, I had a crisis".
(2) Advisor's support: "When you think back to the time when you specified the research question of your doctorate, how satisfied were you with the support of your thesis advisor?" Participants answered on a 5-point Likert scale from 1 (highly dissatisfied) to 5 (highly satisfied).
(3) Professional self-efficacy: Participants responded to six items on 5-point Likert scales ranging from 1 (not true for me) to 5 (completely true for me). The scale measures dispositional self-efficacy regarding one's profession and was published by Abele et al. (2000). Its internal consistency is α = .78.
(4) Research topics of the professor as the selection criterion: "Which characteristics are (have been) particularly important to you with regard to the research topics of the advisor of your doctoral thesis?" Participants answered on a 5-point Likert scale from 1 (not important at all) to 5 (very important).
(5) Usefulness of additional tasks regarding the organization of events not related to seminars: "With regard to your scientific career, how meaningful do you consider these assigned tasks, which have no relation to your doctorate?" Participants answered on a 5-point Likert scale from 1 (not at all meaningful) to 5 (very meaningful).

Results
In the first step, we used the decision tree to determine the criteria by which the group can be divided with respect to the variable "thoughts about dropping out of the PhD". After describing the clusters, we introduce the random forest, which averages over many decision trees and indicates relevant variables that (possibly) increase the probability of dropout thoughts. Combining these two approaches, we derive recommendations to reduce the risk of dropout thoughts.

Decision Tree
Using the decision tree, five key variables were found to be significant in dividing the whole group into six different clusters in terms of the predicted frequencies of dropout thoughts (Figure 1). The clusters are described in terms of their most important criteria, starting with the variables at the lowest level and ending with the variable at the highest level. The total group of 424 individuals had a mean value of dropout thoughts of 3.5. Since dropout thoughts were measured on an 11-point scale, this mean value indicates that the doctoral students occasionally but not frequently thought about dropping out of their doctoral studies. Looking at the second level of the decision tree, we see that the majority of respondents had not experienced a crisis (70%); these respondents can be classified into Clusters 1 and 2. The participants who said that they had experienced a crisis were grouped in Clusters 3, 4, 5 and 6.

Figure 1. Decision tree of dropout thoughts
Explanation: The first number in the blue boxes refers to the mean value of the cluster in terms of dropout thoughts; the second number indicates what percentage of the total group has been classified into this cluster. Yes and no refer to the cut-off values for being assigned to a different cluster based on the answers regarding the variables.

Cluster 1 - The Well Supported
Individuals in this cluster (34% of the sample) reported the lowest number of dropout thoughts (score of 2.0, i.e., very rarely). Respondents in this cluster were (very) satisfied with the support they received from their advisor, especially during the phase of identifying their scientific topic. They had not experienced any serious crisis so far.
The doctoral candidates felt well supported at the beginning of the research process, from which it can be concluded that they received the support they considered appropriate (although no statement was made about the quantity of support in this initiation phase). Here, the support in the initial phase seems to be of particular importance to avoid dropping out of the PhD process. Those who felt well supported at the beginning seemed to continue their research with self-confidence and without further doubts.

Cluster 2 - The Unsure Starters
Cluster 2's average score of 3.3 on the dropout thoughts scale is close to the average score for the overall sample. This cluster consists of more than one-third of the sample, 36%, and is the largest cluster. These respondents tended to be dissatisfied with the support they received from their advisor specifically during the initiation phase of their dissertation. However, they had not yet experienced a serious crisis. Thoughts of dropping out of the doctoral programme arose only rarely.
In contrast to the experience of individuals in Cluster 1, the initial phase of the doctorate did not seem to be satisfactory for individuals in Cluster 2, as it made them think about dropping out of their doctorate somewhat more often than respondents from Cluster 1. Respondents in Cluster 2 did not receive the support from their advisor that they had hoped for when developing the research topic. It is interesting to note that they occasionally considered abandoning their doctorate but did not relate this to experiencing a crisis. There may be doubts here about the meaningfulness of the doctorate, especially in the phase of scientific positioning, which is not experienced as a crisis but rather as an orientation phase. Both Clusters 1 and 2 indicate that the initial phase is a sensitive phase in which the dissertation advisor should pay great attention to the support needs of junior researchers.

Cluster 3 - The Research Oriented
Individuals in Cluster 3 had the same value regarding the frequency of dropout thoughts (3.3) but differ significantly from those in Cluster 2. On the one hand, only 11% of the total sample was assigned to Cluster 3. On the other hand, they differ even more significantly in terms of having experienced a crisis at least once.
Although they have only thought about dropping out of their doctoral studies from time to time, certain conditions have caused them to experience a crisis, thus unsettling these respondents for a certain period of time. In addition, other variables play a role. First, the individuals in this cluster have high expectations for professional self-efficacy. Second, the professor's research topics were of great importance to the doctoral students when choosing their advisor. This indicates that the respondents from Cluster 3 were more intensively concerned with a scientific topic even before they started their doctoral studies. Likewise, it can be concluded that they also perceive themselves as suitable and competent enough for scientific work. The respondents thus seem to have succeeded well in their own approach to a research topic, and they perceive themselves as self-efficacious. Thus, the respondents can be described as research oriented and may be more focused on a scientific career than individuals in the other clusters. However, they do experience crises, which merits attention despite the low frequency of dropout thoughts.

Cluster 4 - The Career Oriented
Cluster 4 contains a very small portion, 4% (17 respondents), of the total sample. With an average score of 3.9 on the dropout thoughts scale, this cluster was slightly above the average for the overall sample. Respondents in this cluster had already experienced a crisis but had relatively high levels of career self-efficacy. The professor's research topics were less important to the individuals of Cluster 4 than to those of Cluster 3, and they differed in terms of the relevance of another variable, the perceived usefulness of additional work tasks.
Doctoral students rated the organization of events (conferences, symposia, etc.) as (very) useful with respect to their career. One could assume that these few individuals do not see themselves so much exclusively as (junior) scientists but rather as employees of the university who do not experience the tasks assigned to them as a hindrance. It is also possible that doctoral students experience these tasks as meaningful in terms of their career opportunities. However, if one assumes that with each additional task that is not directly related to one's own research work, the overall work progress can be reduced and the time extended, feelings of overload may follow and may manifest in thoughts of dropping out of the doctoral programme. Crises may then be experienced due to overload and extended duration of the doctorate.

Cluster 5 - The Overloaded
Only a small number of respondents (4%) were assigned to Cluster 5. However, this cluster had the highest average frequency of dropout thoughts of all clusters, with a value of 7.6, which corresponds to having dropout thoughts often; these respondents are therefore considered to be at risk of dropout.
It is interesting that the perception of low meaningfulness of the additional tasks, the only difference from Cluster 4, leads to a much higher frequency of dropout thoughts. One might suspect that doctoral students are more likely to feel thwarted by additional tasks that they perceive as adding no value to their doctoral studies. There seems to be a discrepancy between self-perception and external perception: Students have a very high expectation of professional self-efficacy but are assigned tasks that are not very meaningful from their perspective. This can lead to them feeling underestimated in their (self-perceived) performance. They may feel thwarted by the additional tasks and hindered in continuously working on their own research. The combination of experiencing crises with frequent thoughts of dropping out is a factor associated with a high risk for either burnout or dropout.

Cluster 6 - The Insecure
Cluster 6 contains 11% of the total sample. On average, doctoral students in this cluster often thought about dropping out of the doctoral programme. Although their score of 7.2 is slightly lower than that of Cluster 5, this cluster can also be considered an at-risk cluster.
These doctoral students know what crises feel like, and in addition, they have lower expectations of their professional self-efficacy. This combination is associated with a risk level similar to that for Cluster 5, but other variables do not seem to be relevant here. Here, too, we suspect an unfinished search for orientation, which at the time of the questionnaire response manifests itself as doubts as to whether their academic activity or research matches their professional expectations. However, it is also unclear whether the university, as a suitable place of work, can make congruent demands on their competencies. For this reason, we would see this cluster in direct contrast to Cluster 1.

Summary
The cluster assignment criteria clearly show the relationship between experiencing a crisis and thinking about dropping out of the doctoral programme, which was to be expected. However, they also indicate that dropout thoughts do not necessarily have to be accompanied by a crisis experience but that there can also be trade-offs in the matching among ideas and self-assessment of competencies within the academic system. Furthermore, we see that the quality of support at the initial phase of the doctoral programme is crucial. In addition, the individual career aspirations with which students start the doctoral programme or complete the doctorate are significant.
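The cluster shares and cluster means reported above can be cross-checked against the overall mean of 3.5. The sketch below simply weights each cluster mean by its share of the sample; since the shares are rounded percentages, the result can only be expected to match approximately.

```python
# (mean frequency of dropout thoughts, share of the sample in %) per cluster,
# as reported in the cluster descriptions.
clusters = {
    1: (2.0, 34),
    2: (3.3, 36),
    3: (3.3, 11),
    4: (3.9, 4),
    5: (7.6, 4),
    6: (7.2, 11),
}

total_share = sum(share for _, share in clusters.values())
weighted_mean = sum(mean * share for mean, share in clusters.values()) / total_share

# Clusters 1 and 2 contain the respondents without a crisis experience.
no_crisis_share = clusters[1][1] + clusters[2][1]

print(total_share)              # 100
print(round(weighted_mean, 1))  # 3.5
print(no_crisis_share)          # 70
```

The weighted mean reproduces the overall mean of 3.5, and the shares of Clusters 1 and 2 add up to the 70% of respondents who had not experienced a crisis.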

Random Forest
The computed random forest, which aggregates 500 decision trees, provides more detailed and reliable information about the underlying indicators of dropout thoughts than a single decision tree. The random forest contains 18 variables that were previously used in the decision tree either as direct split variables or as surrogate splits. With this particular random forest algorithm, missing data cannot be replaced, so the number of observations decreased to 299. In Figure 2, the top 12 variables are plotted in order of minimum depth. The plot of minimum depth shows, for each item, the average level at which it was first used across the trees. For example, the item satisfaction with supervision was first used at level 1.7 of a decision tree, averaged across 500 trees. This chart allows a detailed look at the importance of each variable, as a lower minimum depth indicates greater predictive power of the item.
Figure 2. Random forest: minimum depth plot, frequency of dropout thoughts
ISSN 1927-6044 E-ISSN 1927
First, it is notable that only four of the five variables that distinguished the clusters in the decision tree are relevant; the variable "research topic of professorship" is missing. Professional self-efficacy expectation is in position 2, crisis experience in position 4, satisfaction with supervision of the scientific topic in position 5, and usefulness of the event organization in position 12. In the following, we describe the other variables that are important to the frequency of dropout thoughts. In the description, we distinguish between individual factors of the doctoral student and institutional, organizational, or structural factors, among which we include supervision by the professor.
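How a mean minimum depth is obtained can be sketched as follows. The sketch rests on several assumptions that are ours and not the study's: trees are represented as nested dictionaries, the root counts as depth 0, and a tree in which a variable never appears is simply left out of that variable's average (software packages handle this case in different ways). The three toy trees and their variable names are hypothetical.

```python
def first_depths(node, depth=0, found=None):
    """Map each split variable to the shallowest depth at which it occurs."""
    if found is None:
        found = {}
    if node is None:  # a missing child stands for a leaf
        return found
    var = node["var"]
    found[var] = min(depth, found.get(var, depth))
    first_depths(node.get("left"), depth + 1, found)
    first_depths(node.get("right"), depth + 1, found)
    return found


def mean_min_depth(forest):
    """Average the minimum depth of each variable over the trees using it."""
    totals, counts = {}, {}
    for tree in forest:
        for var, depth in first_depths(tree).items():
            totals[var] = totals.get(var, 0) + depth
            counts[var] = counts.get(var, 0) + 1
    return {var: totals[var] / counts[var] for var in totals}


# Three hypothetical trees (not taken from the study).
toy_forest = [
    {"var": "supervision",
     "left": {"var": "crisis"},
     "right": {"var": "self_efficacy"}},
    {"var": "crisis",
     "left": {"var": "supervision"}},
    {"var": "self_efficacy",
     "left": {"var": "supervision", "left": {"var": "crisis"}}},
]
depths = mean_min_depth(toy_forest)
for var, value in sorted(depths.items(), key=lambda item: item[1]):
    print(f"{var}: {value:.2f}")
```

Variables with a small mean minimum depth tend to be used close to the root and therefore partition large parts of the sample, which is why they are read as the more important indicators.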
The top four positions in the minimum depth plot are occupied by variables that address individual factors and institutional factors equally. General satisfaction with supervision, as the most important variable, evaluates the interaction between the individual expectation of the doctoral student and the supervision by the professor/advisor (as part of the university organization). In our view, this not only refers to an individual level but also includes the doctoral conditions in the evaluation. Although the professional self-efficacy expectation (2) can be seen as a characteristic of personality, we assume that institutional conditions also influence the outcome. This, however, happens in an indirect way: if structural and organizational demands require a certain work mode or "academic" habitus, then people might perceive themselves as not self-efficacious in this respect (i.e., job-related). In position 3, we find the factor of instrumentality, which refers to a more agentic characteristic of the gender role self-concept. Again, this variable is positioned at the individual level. Apart from the individual level, there is also an implicit structural level, namely, that of the demands placed on professionalizing scientists: pursuing one's own career is recognized in the academic world as single-mindedness and is a prerequisite for a successful career. In position 4 is the experience of crisis, which in our view indicates a lack of fit between structural requirements and individual performance (ability).
Satisfaction with the professor's support in developing the research question (5) adds a temporal dimension: it marks the initial phase of the doctorate as significant and fragile. Although at first glance this variable seems to be an individual one, we see it as a feature of the structural level where institutional influence plays an important role, namely, how a "junior scientist" is introduced into the academic discipline. The factor individuality (6) characterizes a personal motive of the doctoral student, namely, the possible realization of self, personal growth and development through work activities. One could speak here of the ideal type of scientist who, driven by intrinsic motives, curiosity, and the urge to research, puts his or her life in the service of science (see Weber's "Science as a Profession", 1919). If the motive is highly pronounced, the work itself would be rewarding; a delimiting mode of working would be difficult to recognize as such and even more difficult to counter. The influence of the previous duration of the doctorate on the frequency of thoughts of dropping out is indirectly expressed in the variable of how often the topic of the doctoral thesis has been presented at the institute so far (7). This variable addresses a time factor for the second time (since most institutes offer doctoral colloquia in which each doctoral student can/should present his or her thesis at least once per term). Even though a correlation between increasing doctoral duration and frequency of dropout thoughts is plausible, it does not appear directly in the study group (e.g., duration of previous doctorate or expected doctoral duration) but is only mediated by the two time variables. That is, the mere duration of the doctorate is not an indicator; rather, it is the conditions and adjustments to them. Perceived social support (8) is more a feature of the individual and his or her social (private) environment and less a feature of institutional structures.
Equally individual, but on the other end of the support continuum, the experience of stress due to competition among colleagues (9) is an important variable associated with dropout thoughts. Variable 10 is also about experiencing stress, but it points to an organizational problem because this stress arises from work tasks that cannot be planned. The relationships of dropout thoughts with social support as well as both "stress variables" are surprising, as they do not concern any discussion of content-related issues but rather address communicative and organizational conditions exclusively. Among all the variables, general satisfaction with the doctorate to date (11) is also an important indicator. This is not surprising since this variable marks the other end of the satisfaction continuum, i.e., it is opposed to (thoughts of) dropping out. Finally, in this group of indicators, the meaningfulness of the additional task of event organization (12; without reference to seminars) influences the probability of dropout thoughts. We also see this variable as a combination of individual assessment and institutional framing. If doctoral students are not transparently informed about the benefits of event organization for their own careers, this task may be more likely to overwhelm them and thus increase the likelihood of dropout thoughts.
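The minimum depth measure behind Figure 2 can be illustrated with a minimal sketch. The code below is not the software used in the study; it assumes a toy forest of three hand-written trees, hypothetical variable names, a root counted as depth 0, and, as a simplification, it averages each variable's minimum depth only over the trees in which that variable appears.

```python
from statistics import mean

# A toy tree is a nested dict: internal nodes hold the split variable under
# "var" and their children under "left"/"right"; a leaf is None.
def min_depths(tree, depth=0, found=None):
    """Return {variable: shallowest depth at which it splits} for one tree."""
    if found is None:
        found = {}
    if tree is None:  # leaf: nothing to record
        return found
    var = tree["var"]
    found[var] = min(depth, found.get(var, depth))
    min_depths(tree.get("left"), depth + 1, found)
    min_depths(tree.get("right"), depth + 1, found)
    return found


def mean_min_depth(forest):
    """Average each variable's minimum depth over the trees that use it."""
    per_var = {}
    for tree in forest:
        for var, d in min_depths(tree).items():
            per_var.setdefault(var, []).append(d)
    return {var: mean(ds) for var, ds in per_var.items()}


# Hypothetical three-tree forest (illustration only, not study data).
forest = [
    {"var": "satisfaction_supervision",
     "left": {"var": "self_efficacy", "left": None, "right": None},
     "right": {"var": "crisis", "left": None, "right": None}},
    {"var": "self_efficacy",
     "left": {"var": "satisfaction_supervision", "left": None, "right": None},
     "right": None},
    {"var": "satisfaction_supervision",
     "left": {"var": "crisis", "left": None, "right": None},
     "right": None},
]

depths = mean_min_depth(forest)
# Variables sorted from most to least important (lowest mean depth first).
ranking = sorted(depths.items(), key=lambda kv: kv[1])
for var, d in ranking:
    print(f"{var}: {d:.2f}")
```

In this toy forest, the variable that most often splits near the root receives the lowest mean minimum depth and is therefore ranked first, which mirrors how the positions in Figure 2 are read.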

Discussion
As we have shown, individual characteristics or traits are not the only important correlates of the dropout thoughts of doctoral students. Rather, the match between individual ideas and organizational and structural requirements seems to be most prominent among the variables examined. Surprisingly, the previous or even expected duration of the doctorate does not play a role; a temporal component has only an indirect effect. Although computer science is a rather technical subject, gender does not show any influence on the frequency of dropout thoughts, contrary to what the results of Bostwick and Weinberg (2018) might suggest. Rather, the indicators point to lacking or inadequate guidance and support, not only at the level of individual support but especially at the institutional level. This confirms the findings of Pifer and Baker (2014).
The results of this study are not without limitations. First, the overall number of cases did not reach the goal, set within this project, of 5% of the basic population. This is due to a relatively low response rate. Therefore, the number of cases used in decision trees and random forests is comparatively low but still within specifications. Next, the results of the decision tree are highly dependent on the exact constellation of the sample. Therefore, a random forest was calculated, which validates the results of the decision tree. However, there is a limitation here as well: the sample used for the random forest does not equal the sample used for the decision tree since observations with missing values are excluded from the calculation. Theoretically, a systematic absence of observations can lead to bias in the results, but this was not found in our study. Finally, the survey took place only in the German higher education system; the application of the results to other countries is therefore possible only to a very limited extent.
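The concern about excluded observations can be made concrete with a small sketch. The text does not describe how the absence of systematic missingness was checked, so the following is only one simple, hypothetical way to look at it: split the sample into complete and excluded cases (listwise deletion) and compare the outcome between the two groups. All variable names and values are invented for illustration.

```python
from statistics import mean, pstdev


def split_complete_cases(rows, predictors):
    """Listwise deletion: keep rows with no missing (None) predictor value."""
    complete, dropped = [], []
    for row in rows:
        is_complete = all(row.get(p) is not None for p in predictors)
        (complete if is_complete else dropped).append(row)
    return complete, dropped


def standardized_mean_difference(complete, dropped, outcome):
    """Difference in outcome means, scaled by the std of the combined sample."""
    a = [r[outcome] for r in complete]
    b = [r[outcome] for r in dropped]
    spread = pstdev(a + b)
    return (mean(a) - mean(b)) / spread if spread else 0.0


# Hypothetical toy responses (not study data); None marks a missing answer.
rows = [
    {"dropout_thoughts": 1, "self_efficacy": 4, "crisis": 0},
    {"dropout_thoughts": 3, "self_efficacy": None, "crisis": 1},
    {"dropout_thoughts": 2, "self_efficacy": 3, "crisis": 1},
    {"dropout_thoughts": 2, "self_efficacy": 2, "crisis": None},
    {"dropout_thoughts": 4, "self_efficacy": 1, "crisis": 1},
]

complete, dropped = split_complete_cases(rows, ["self_efficacy", "crisis"])
smd = standardized_mean_difference(complete, dropped, "dropout_thoughts")
print(len(complete), len(dropped), round(smd, 3))
```

A standardized difference close to zero would be consistent with the excluded cases not differing systematically on the outcome; a large value would suggest that the reduced sample may be biased. This is a descriptive check only and does not replace a formal analysis of the missing-data mechanism.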

Recommendations
Based on these findings, we identify interventions that may reduce the incidence of dropout thoughts and therefore possible subsequent doctoral dropout. Although there is a variety of recommendations to doctoral students at the behavioural level to support their progress and satisfaction, we believe that recommendations at the relational level are equally important. These relate to systemic and structural components that may have delayed and indirect effects but are more promising in creating better conditions for doctoral students. For this reason, we refer only to the structural level in the following recommendations.

Attention to the Different Doctoral Phases
As we noted in Clusters 1 and 2, the early phase of the doctoral process is particularly important, as it is characterized by orientation and adaptation to the scientific habitus. Special attention regarding supervision should be given to this phase. Here, the balance between supervision and the autonomy of doctoral students is very important. However, it is equally important to communicate the doctoral student's expectations of the supervision as well as the scientific work and the advisor's requirements. One way to improve this discussion between both sides can be a written agreement between advisors and doctoral students that is worked out together and contains questions about time management, working methods, expectations, and responsiveness that are as detailed as possible.
In addition to external control of the supervision process, it is absolutely necessary to communicate the structure of the doctoral process to the doctoral student in a transparent way. This includes the task of dividing the entire process into parts and concrete work tasks and planning this as concretely as possible (e.g., by using a timeline). It is important to create a solid exposé and to enable reliable planning by a student for his or her time, with a transparent distribution of tasks and expected responsibilities throughout the process. Even though the beginning of the PhD should be supervised with special attention, in our view, the phases of researching, publishing results and writing the whole thesis also require good supervision but at a level adapted to the student's needs. This means supervision to support publishing (e.g., which conferences are important, how to formulate an abstract), to introduce the student to scientific networks, to support the writing process (e.g., dealing with motivation, writer's block), and to actively refer the student to external support.

Relationship between the Doctoral Student and Advisor
Since advisors play a very important role in the progress of PhD students and in their experience of success, we recommend highlighting the relationship between advisors and PhD students. This includes a basic recognition of the mentor function and the influence of the advisor on the doctoral student. The development of doctoral students to become scientists and colleagues must be ensured. This concerns the socialization process, which, in addition to all content and research issues, underpins the scientific qualifications at all points. Thus, regular meetings of the doctoral student with the advisor, regular reports of the candidates on their progress and the support of additional mentors are essential. In addition, a good advisor-advisee relationship includes enabling autonomous work and publishing. To improve advisors' knowledge and behaviour and to support them in this responsible role, Barnes and Austin (2009, p. 312) recommend that faculty "establish codes or guidelines for doctoral advising and mentoring relationships that include explaining institutional guidelines for faculty-advisor relationships, articulating advisors' responsibilities for supporting students in a timely manner, and offering examples of the various ways in which effective and responsible advising can occur across the range of disciplines".
Conversations between the doctoral student and the advisor, the regularity of which should be recorded externally (e.g., through a specific supervision agreement, see above), should primarily serve to answer doctoral students' questions regarding their qualifying work. This will inevitably include questions about motivation or other crises. Thus, the advisor's leadership skills should also include conflict resolution, motivation, and communication skills. Since this relationship often encompasses several levels (the professor can be an advisor, supervisor and colleague all in one), the risk of a relationship of dependency is particularly high. This can and should certainly be subject to external control, e.g., regular evaluation by a doctoral office.

Support of the Supervisor
Since professional support is important to students in the doctoral process, recommendations could also be derived that relate to support for supervisory responsibilities. Therefore, we go a step further at the institutional level and suggest that, in addition to the research and teaching competence of professors, their leadership competence should also be included as a relevant criterion in an appointment procedure. This means, however, that competencies such as communication, conflict resolution, motivation, and leadership could be components of continuing education for prospective professors.

Selection of Doctoral Candidates
The selection of potential doctoral candidates should be targeted. Not only the formal achievements of the applicants but also the goals that a student is pursuing with the doctorate should play a role. This touches on the question of whether thoughts of dropping out of the PhD should be avoided at all costs. Thus far, we have assumed in the recommendations that these may lead to a lower risk of premature discontinuation of the PhD (without receiving the degree). However, in some cases, we believe that it is advisable to consider discontinuing the doctoral programme and to be able to communicate this transparently and openly in the supervisory relationship. In some cases, this can be done without jeopardizing one's professional career. In computer science in particular, there are numerous career opportunities where a doctorate is not necessary. However, a doctorate is mandatory for an academic career. Addressing the goals of doctoral students, preferably before they begin their doctorate, can save them much time and frustration if they find that academia, with all its responsibilities, is not a good fit for them. This observation closes the gap with respect to the recommendation of transparency described above.

Cooperation
In addition to the responsibility of advisors, it is recommended that doctoral students work and pursue their doctorates in a learning environment that fosters their competence and social skills, especially through research in a working group, joint publications, and social interactions. This means that doctoral students' research activities are preferably conducted in groups. Likewise, it can be recommended to designate another contact person (often a postdoc), in addition to the supervising person, who is available on a collegial level for questions related to content or even formal issues.

Conclusions
Even if at first glance the considerations about dropping out of a doctoral programme seem to concern more individual-level factors, our results show that the institutional framework is equally important. The point here is how we could and should offer supportive conditions for a successful doctorate to capable doctoral students who are both interested in research and committed to teaching. Thus, the following conclusions are not meant to make a doctorate possible for every interested student but rather to promote success for students who are capable. The conclusions focus on the target group of students who have been selected as having high potential in the research area.
Since role models are of great importance for the development of a professional self-image, we believe that advisors should be educated and supported within their respective faculties in such a way that they can in turn ensure the professional supervision of doctoral students. In addition, Barnes and Austin (2009) recommend that advisors discuss with doctoral students not only research and intellectual issues, such as aspects of their thesis, research, and teaching, but also aspects of career development, particularly after the doctorate. This means that the role of the advisor may include multiple expectations that go beyond the professional supervision of the doctorate. This is even more true if doctoral students expect support from their faculty to escape social isolation, particularly at the early stage of their doctorate (Janta, Lugosi, & Brown, 2014). The responsible persons in the faculties and universities who are in charge of doctoral programmes may initiate discussions about the expectations and requirements of advisors' roles and thus contribute to self-reflective processes on the part of the advisors and the advisee. However, this indicates above all the need to better structure the doctoral process and to not only communicate explicit requirements and expectations (such as publications, teaching obligations, organizational and administrative tasks) but also make public the hidden curriculum, i.e., the unwritten, implicit requirements for successful scientists, such as networking and communication abilities. At the same time, advisees should understand that it is their role and obligation to ask questions and to "request feedback about their progress, abilities, and weaknesses" (Barnes & Austin, 2009, p. 312). Overall, the advisor-advisee relationship is a key factor in the process of successfully achieving a doctorate. This can be confirmed by our results.