[RESUMO] In clinical trials, mixed models are becoming increasingly popular for the analysis of longitudinal data. The main motivation is often expected dropout, which can easily be handled through the analysis of the longitudinal trajectories. In many situations, analyses are corrected for baseline covariates such as study site or stratification variables. Key questions are then how to perform a longitudinal analysis correcting for baseline covariates, and how sensitive the results are to the choices made and the models used.
In this presentation, we will first present and compare a number of techniques available for correcting for baseline covariates within the context of the linear mixed model for continuous outcomes. Second, we will study the sensitivity of the various techniques when the baseline correction is based on a misspecified model or omits important covariates. Finally, our findings will be used to formulate some general guidelines relevant in a clinical trial context. All findings and results will be illustrated extensively using real and simulated data.
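As a minimal sketch of the setting described above (not taken from the talk), the following simulates a longitudinal trial and fits a random-intercept linear mixed model in which the outcome is corrected for a baseline covariate; all variable names and true parameter values are invented for illustration, assuming `statsmodels` is available.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n, visits = 60, 4                                     # subjects x visits

subj = np.repeat(np.arange(n), visits)
time = np.tile(np.arange(visits), n)
treat = np.repeat(rng.integers(0, 2, n), visits)      # treatment arm
baseline = np.repeat(rng.normal(50, 10, n), visits)   # baseline covariate
u = np.repeat(rng.normal(0, 2, n), visits)            # random intercepts
# True model: baseline effect 0.5, treatment-by-time effect 1.5.
y = 0.5 * baseline + 1.5 * treat * time + u + rng.normal(0, 1, n * visits)
df = pd.DataFrame(dict(y=y, subj=subj, time=time, treat=treat,
                       baseline=baseline))

# Random-intercept mixed model, correcting for the baseline covariate.
model = smf.mixedlm("y ~ baseline + treat * time", df, groups=df["subj"])
fit = model.fit()
print(fit.params[["baseline", "treat:time"]])
```

The fitted baseline coefficient should land close to the simulated value of 0.5; refitting with the baseline term dropped or misspecified is one simple way to probe the sensitivity questions the abstract raises.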
[RESUMO] Machine learning first appeared in computer science research in the 1950s. It is a method of data analysis based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention. Machine learning methods have become popular nowadays due to improvements in data storage and data processing. Some of the most used machine learning methods across different areas of knowledge include the k-Nearest Neighbors algorithm, naive Bayes, decision trees, artificial neural networks, association rules, linear regression, and logistic regression, among others. Machine learning models are trained on existing data by running large amounts of data through the model until it finds enough patterns to make accurate decisions about those data; the trained model is then used to score new data and make predictions. Some applications of these models include fraud detection, online advertising, pattern and image recognition, prediction of equipment failures, web search results, and spam filtering. The aim of this conference is to show the origins and applications of machine learning and how data can be transformed into knowledge and action.
[RESUMO] In this talk, I address the problem of performing dimensionality reduction and variable selection with Principal Component Analysis (PCA), where the object of inference is the weights matrix (i.e., the matrix of coefficients used to compute the component scores). While methods for the loadings matrix are well established in the literature, techniques for the weights have not yet crystallized. To achieve regularization, I propose a Bayesian specification of the model with Inverse-Gamma priors on the variances of the weights. To achieve variable selection, I specify a PCA version of Stochastic Search Variable Selection (SSVS), a form of spike-and-slab prior. The model is then estimated via Variational Inference. A simulation study shows that the model improves on state-of-the-art Sparse PCA in retrieving the correct structure of the weights matrix. Finally, Bayesian PCA is applied to a genetic dataset in order to test its performance on real-world data.
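For readers unfamiliar with SSVS, the generic spike-and-slab form (following George and McCulloch) applied to a PCA weight $w_{jk}$ would read roughly as below; the talk's exact specification, hyperparameters, and notation may differ.

```latex
w_{jk} \mid \gamma_{jk} \;\sim\;
  (1-\gamma_{jk})\,\mathcal{N}\!\left(0,\,\tau^{2}\right)
  \;+\; \gamma_{jk}\,\mathcal{N}\!\left(0,\,c^{2}\tau^{2}\right),
\qquad
\gamma_{jk} \sim \mathrm{Bernoulli}(\pi),
```

with a small spike variance $\tau^{2}$ shrinking excluded weights toward zero and a slab factor $c^{2} \gg 1$ leaving selected weights essentially unpenalized; the posterior of the indicator $\gamma_{jk}$ then drives variable selection.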
[RESUMO] In medical research, the variables of interest are usually the time of occurrence of an event, such as clinical diagnosis, cure of a disease, or death, and/or an unobserved latent process measured repeatedly over time, such as an indicator of the progression of a disease. In some applications, both the time until an event of interest and the longitudinal observations are available. In these cases, joint modeling is required, since a separate analysis may lead to inefficient or biased results. From the 1990s onwards, several authors have proposed joint models for follow-up data, in which the survival submodel typically incorporates the effect of an endogenous time-dependent covariate measured with error, and the longitudinal analysis corrects for nonrandom dropout via a time-to-event model. In this talk, we introduce the main mathematical formulation of a joint model for longitudinal and survival data and discuss its standard estimation approaches. In addition, we present a novel two-stage estimation methodology based on simulation-extrapolation techniques. Synthetic and real data are used to compare the accuracy and computational time of each approach.
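A standard shared-parameter formulation of such a joint model (in the spirit of the literature the abstract refers to, e.g. Rizopoulos; the talk's own notation may differ) couples a linear mixed submodel for the longitudinal marker with a proportional-hazards submodel for the event time:

```latex
y_i(t) = m_i(t) + \varepsilon_i(t), \qquad
m_i(t) = \mathbf{x}_i^{\top}(t)\,\boldsymbol{\beta}
       + \mathbf{z}_i^{\top}(t)\,\mathbf{b}_i, \qquad
\mathbf{b}_i \sim \mathcal{N}(\mathbf{0}, D), \quad
\varepsilon_i(t) \sim \mathcal{N}(0, \sigma^{2}),
```

```latex
h_i(t) = h_0(t)\,
  \exp\!\left\{ \boldsymbol{\gamma}^{\top}\mathbf{w}_i
              + \alpha\, m_i(t) \right\},
```

where $m_i(t)$ is the true, error-free marker trajectory shared by both submodels, $\mathbf{w}_i$ are baseline covariates, and the association parameter $\alpha$ quantifies how the longitudinal process drives the hazard.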