The building violations data will be filtered to include the Building/Use type only, in particular the illegal conversion of non-residential buildings/units to residential use without approval from the New York City Department of Buildings (excluding illegal vehicle storage, for example). For more information regarding illegal conversion:
nyc.gov
Analysis
The analysis will include descriptive statistics to better understand the data and its distribution, including bar plots for all zip codes or per county. The data will also be mapped to observe the spatial distribution of the variables. A local autocorrelation test will be conducted to detect spatial clustering. A linear regression model will then be fitted to quantify the relationship between the variables, with the number of permits as the independent variable. Statistical tests and goodness-of-fit tests will be used to assess the models. If the linear correlation proves significantly weak, a polynomial model might be fitted instead.
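As a rough illustration of the regression step, the sketch below fits an ordinary least squares line and reports R^2 as a simple goodness-of-fit measure. It assumes that the dependent variable is the number of violations per zip code, which the proposal does not state explicitly. The function names and all numbers are made up for illustration and are not project data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only, one entry per zip code.
permits = [12, 25, 31, 48, 60, 75]   # independent variable
violations = [4, 9, 8, 15, 17, 24]   # assumed dependent variable

slope, intercept = fit_line(permits, violations)
r2 = r_squared(permits, violations, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

A low R^2 on the real data would be one signal that the linear form is inadequate and that a polynomial term is worth trying.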
Moreover, outliers will be identified, such as very high-income zip codes, on the assumption that these are excluded from urban renewal processes and that their number of housing violations is extremely low.
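One possible way to flag very high-income zip codes is Tukey's rule, which marks values above the third quartile plus 1.5 times the interquartile range. The proposal does not name an outlier method, so this choice is an assumption. The function name, the labels, and the income values below are hypothetical.

```python
from statistics import quantiles


def high_income_outliers(income_by_zip, k=1.5):
    """Return the keys whose income exceeds Q3 + k * IQR (Tukey's upper fence)."""
    q1, _, q3 = quantiles(income_by_zip.values(), n=4)
    upper_fence = q3 + k * (q3 - q1)
    return [z for z, income in income_by_zip.items() if income > upper_fence]


# Made-up median household incomes keyed by placeholder zip code labels.
incomes = {
    "A": 60000, "B": 55000, "C": 65000, "D": 58000,
    "E": 62000, "F": 61000, "G": 250000,
}

flagged = high_income_outliers(incomes)
print(flagged)
```

The flagged zip codes could then be excluded from the regression or reported separately, depending on how the assumption about urban renewal holds up in the data.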
Deliverable
The deliverable of this research will be a statistical conclusion regarding the question and hypothesis, and a better sense of the best way to detect housing violations.
References
This project is an expansion of my earlier work, done as part of the Civic Analytics course taught by Prof. C.E. Kontokosta, NYU CUSP, Fall 2017.