Zhao, G. and Lu, L., 2018. Efficient cleaning method of low quality marine data in large data environment. In: Liu, Z.L. and Mi, C. (eds.), Advances in Sustainable Port and Ocean Engineering. Journal of Coastal Research, Special Issue No. 83, pp. 679–684. Coconut Creek (Florida), ISSN 0749-0208.
Optimizing the cleaning of low-quality marine data requires the density value in the neighborhood of each marine data sample, which identifies the areas where samples gather and so supports optimized extraction of marine feature data. Traditional methods build the original transaction data set and discover the distribution rules of marine data, but they do not reveal where the data samples gather, so their cleaning efficiency is low. This paper presents an optimized cleaning method for low-quality marine data in a big data environment, based on time series and an improved Canopy algorithm. The method uses a time series model to identify the quantity of marine data and classifies the marine features by time series. A high-density clustering method then yields the density value near each marine data sample and reveals the areas where samples gather. Tag speed is introduced into the adaptive adjustment of the sliding window to optimize the cleaning. Experimental simulations show that the proposed method cleans low-quality marine data efficiently and safeguards marine data quality in a big data environment.
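The idea of a "density value near each sample" can be illustrated with a minimal sketch. It is not the authors' improved Canopy algorithm: it simply counts, for each sample, how many other samples lie within a fixed radius, and treats samples with too few neighbors as candidates for cleaning. The function names, the `radius` and `min_neighbors` parameters, and the example readings are all hypothetical.

```python
from math import dist


def neighbor_density(samples, radius):
    """For each sample, count the other samples within `radius` (Euclidean)."""
    return [
        sum(1 for j, q in enumerate(samples) if j != i and dist(p, q) <= radius)
        for i, p in enumerate(samples)
    ]


def split_by_density(samples, radius, min_neighbors):
    """Split samples into (dense, sparse) by their neighbor counts."""
    densities = neighbor_density(samples, radius)
    dense = [s for s, d in zip(samples, densities) if d >= min_neighbors]
    sparse = [s for s, d in zip(samples, densities) if d < min_neighbors]
    return dense, sparse


# Hypothetical (time index, reading) pairs; the two axes are assumed to be
# on comparable scales so that a single radius is meaningful.
readings = [(0, 10.0), (1, 10.2), (2, 9.9), (3, 25.0), (4, 10.1)]

dense, sparse = split_by_density(readings, radius=2.5, min_neighbors=1)
print(sparse)  # samples with no neighbor inside the radius
```

In this sketch the sparse samples are the ones lying outside any gathering area. A real pipeline would also need to scale the time and value axes and choose the radius from the data rather than fixing it by hand.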