Practical Statistics for Data Science and Python

You cannot observe every customer in the world, so you work with a sample. But how reliable is your estimate?
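One common way to put a number on "how reliable" is a confidence interval around the sample mean. The sketch below is only an illustration: the `spend` values are made up, the helper name `mean_confidence_interval` is hypothetical, and the t-based interval assumes the data are roughly normal or the sample is reasonably large.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, lower, upper) for a t-based confidence interval of the mean."""
    values = np.asarray(sample, dtype=float)
    n = values.size
    mean = values.mean()
    # Standard error of the mean: sample standard deviation (ddof=1) over sqrt(n)
    sem = values.std(ddof=1) / np.sqrt(n)
    # Two-sided critical value from the t distribution with n - 1 degrees of freedom
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    return mean, mean - t_crit * sem, mean + t_crit * sem


# Hypothetical spend values for 8 sampled customers (not from the article)
spend = [52, 48, 50, 47, 53, 49, 51, 50]
mean, low, high = mean_confidence_interval(spend)
print(f"mean={mean:.2f}, 95% CI=({low:.2f}, {high:.2f})")
```

A narrow interval suggests the sample pins the mean down well; a wide one says you need more data before trusting the estimate.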

If you want to raise the quality of your analyses, the book Estadística Práctica para Ciencia de Datos con R y Python by Peter Bruce, Andrew Bruce, and Peter Gedeck is, in my opinion, the "Swiss Army knife" every professional should keep at hand.

from scipy import stats

# df is assumed to be a DataFrame with 'time' and 'tip' columns
lunch = df[df['time'] == 'Lunch']['tip']
dinner = df[df['time'] == 'Dinner']['tip']
stats.ttest_ind(lunch, dinner, equal_var=False)  # Welch's t-test

tiempos = [120, 122, 119, 121, 123, 118, 220]  # 220 looks like an outlier
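To see why a value like 220 matters, it helps to compare the mean with more robust summaries. This is a minimal sketch on the same `tiempos` list. The 20% trimming proportion and the 1.5 × IQR fence are conventional choices assumed here, not something the article prescribes.

```python
import numpy as np
from scipy import stats

tiempos = [120, 122, 119, 121, 123, 118, 220]

mean = np.mean(tiempos)      # pulled upward by the 220
median = np.median(tiempos)  # unaffected by a single extreme value
# Trimmed mean: drop 20% of the sorted values from each end (one value per side here)
trimmed = stats.trim_mean(tiempos, proportiontocut=0.2)

# Flag points outside the conventional 1.5 * IQR fences
q1, q3 = np.percentile(tiempos, [25, 75])
iqr = q3 - q1
outliers = [t for t in tiempos if t < q1 - 1.5 * iqr or t > q3 + 1.5 * iqr]

print(f"mean={mean:.2f}, median={median}, trimmed mean={trimmed}")
print("flagged as outliers:", outliers)
```

Flagging a point is not the same as deleting it: the rule only tells you which observations deserve a closer look.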

Using regression models to estimate outcomes, detect anomalies, and understand relationships between variables. Classification:

stats.mannwhitneyu(lunch, dinner, alternative='two-sided')

| ✅ Do | ❌ Don't |
|------|---------|
| Always visualize before testing | Trust p-values blindly |
| Report effect size + CI, not just p | Ignore multiple comparisons |
| Check assumptions (normality, equal variance) | Remove outliers without justification |
| Use non-parametric tests if assumptions fail | Confuse statistical significance with practical importance |
| Set significance level before seeing data | Cherry-pick variables in regression |
| Use bootstrap for complex estimators | Forget to document random seeds |
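The "Ignore multiple comparisons" warning can be made concrete with a Bonferroni correction, the simplest (and most conservative) adjustment. The sketch below uses made-up p-values, and the helper name `bonferroni` is hypothetical. Less conservative methods such as Holm exist and may be preferable in practice.

```python
import numpy as np


def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and a boolean reject mask."""
    p = np.asarray(p_values, dtype=float)
    # Multiply each p-value by the number of tests, capping at 1.0
    adjusted = np.minimum(p * p.size, 1.0)
    return adjusted, adjusted < alpha


# Hypothetical p-values from four separate tests on the same dataset
p_values = [0.001, 0.02, 0.04, 0.30]
adjusted, reject = bonferroni(p_values)
for raw, adj, rej in zip(p_values, adjusted, reject):
    print(f"raw p={raw:.3f}  adjusted p={adj:.3f}  reject H0: {rej}")
```

Three of the four raw p-values fall below 0.05, but only the first survives the correction.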

# Take a random sample
muestra = df['ingreso'].sample(n=100, random_state=42)
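Once you have a sample, the bootstrap is one way to estimate the uncertainty of a statistic that has no simple standard-error formula, such as the median. The sketch below is an assumption-laden illustration: `ingresos` is synthetic lognormal data standing in for `df['ingreso']`, the helper name `bootstrap_ci` is hypothetical, and it uses the plain percentile method rather than a bias-corrected variant.

```python
import numpy as np


def bootstrap_ci(values, statistic=np.median, n_resamples=2000,
                 confidence=0.95, seed=42):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = np.random.default_rng(seed)  # fixed seed so the result is reproducible
    values = np.asarray(values, dtype=float)
    # Each row holds the indices of one resample, drawn with replacement
    idx = rng.integers(0, values.size, size=(n_resamples, values.size))
    estimates = statistic(values[idx], axis=1)
    tail = (1 - confidence) / 2 * 100
    lower, upper = np.percentile(estimates, [tail, 100 - tail])
    return lower, upper


# Synthetic, right-skewed incomes (a stand-in for the article's df['ingreso'])
ingresos = np.random.default_rng(0).lognormal(mean=10, sigma=0.5, size=100)
low, high = bootstrap_ci(ingresos)
print(f"sample median={np.median(ingresos):.0f}, 95% CI=({low:.0f}, {high:.0f})")
```

Passing the seed explicitly also covers the table's reminder to document random seeds.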

import statsmodels.api as sm

# Linear regression model
X = df[['precio', 'publicidad']]
y = df['ventas']
X = sm.add_constant(X)  # Add the intercept
modelo = sm.OLS(y, X).fit()
print(modelo.summary())

5. Design of Experiments (A/B Testing)
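A permutation test is one assumption-light way to analyze an A/B experiment: if the design made no difference, shuffling the group labels should produce differences as large as the observed one fairly often. The sketch below is illustrative only. The helper name `permutation_test` is hypothetical, the number of permutations is an arbitrary choice, and the data are the small `grupo_A` and `grupo_B` lists used in this section.

```python
import numpy as np

grupo_A = [12, 15, 14, 11, 13, 12, 15]  # old web design
grupo_B = [14, 17, 16, 15, 18, 14, 16]  # new web design


def permutation_test(a, b, n_permutations=10000, seed=42):
    """Two-sided permutation test for a difference in means."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = b.mean() - a.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_permutations):
        # Shuffle the pooled values and split them back into two groups
        shuffled = rng.permutation(pooled)
        diff = shuffled[a.size:].mean() - shuffled[:a.size].mean()
        if abs(diff) >= abs(observed):
            count += 1
    # Add-one correction keeps the estimated p-value strictly above zero
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value


observed, p_value = permutation_test(grupo_A, grupo_B)
print(f"observed difference in means: {observed:.2f}, p-value: {p_value:.4f}")
```

With only seven observations per group, treat the result as a demonstration of the mechanics rather than evidence about any real design.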

from scipy import stats

# Group A (old web design) and Group B (new web design)
grupo_A = [12, 15, 14, 11, 13, 12, 15]
grupo_B = [14, 17, 16, 15, 18, 14, 16]

# Run the t-test
t_stat, p_val = stats.ttest_ind(grupo_A, grupo_B)
print(f"p-value: {p_val}")
if p_val < 0.05:
    print("The difference between the two designs is statistically significant.")
else:
    print("No statistically significant difference between the designs was detected.")

4. Regression and Prediction
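As a rough sketch of using a fitted regression both to predict and to flag anomalies, the example below fits ordinary least squares on synthetic data. Everything here is an assumption made for illustration: the data are generated, one anomaly is injected on purpose, the 3-standard-deviation threshold is a convention, and the fit uses NumPy's `lstsq` rather than statsmodels to keep the example dependency-light. The column names `precio`, `publicidad`, and `ventas` are borrowed from the article's regression snippet.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
precio = rng.uniform(5, 15, size=n)
publicidad = rng.uniform(0, 10, size=n)
# Synthetic relationship: ventas = 100 - 3*precio + 5*publicidad + noise
ventas = 100 - 3 * precio + 5 * publicidad + rng.normal(0, 2, size=n)
ventas[10] += 40  # inject one artificial anomaly

# Design matrix with an explicit intercept column
X = np.column_stack([np.ones(n), precio, publicidad])
coef, *_ = np.linalg.lstsq(X, ventas, rcond=None)

# Predict sales for a new case: precio=10, publicidad=5
nuevo = np.array([1.0, 10.0, 5.0])
prediccion = nuevo @ coef

# Flag observations whose standardized residual exceeds 3 in absolute value
residuals = ventas - X @ coef
z = (residuals - residuals.mean()) / residuals.std(ddof=1)
anomalies = np.flatnonzero(np.abs(z) > 3)

print("coefficients (intercept, precio, publicidad):", np.round(coef, 2))
print(f"predicted ventas: {prediccion:.1f}")
print("anomalous row indices:", anomalies.tolist())
```

Note that the anomaly itself distorts the fit slightly, which is one reason to inspect residuals before trusting the coefficients.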
