Recreational Marijuana Data Integrity Verification


Paul Andreas Fischer

1/19/2017

 





The purpose of this cybersecurity effort is to identify trends in community activity, usage, and coding on three webpages. Three measures are used. The first is a weighted per capita membership, adjusted for last-month marijuana usage according to statistics on diagnoses of marijuana use dependency in the last year sourced through the National Institutes of Health. The second is statistical significance testing for key HTML or Python coding terms which may be present on the pages, ranging from common terms such as margin and italic/bold to design-oriented coding such as getOptions and fix. The third is a theoretical W3C validation scheme borrowed from a co-operative effort by employees of Dropbox, Google, and Mozilla.

An effort throughout will be made to avoid redundancies in data and to reduce reliance on contingent terminology in order to establish statistical significance in further analysis. While this will not be used to justify any legal actions or hold significant ramifications for user, community, or legislative individuals or groups due to the hypothetical nature of the theories of security fundamental to the arguments provided, the data may be accessed and used publicly and reproduced. As with all research looking at differences in data, change over time will be critical to determining whether this is an appropriate sequence to validate the integrity of the media distributed. This data is accessed legally under the Digital Millennium Copyright Act as encryption research to enhance secure methods of encryption technologies (section 1201(g)), to measure and protect personal privacy (section 1201(i)), and security testing should readers wish to check their own computer, computer system, or computer network (section 1201(j)).






Data results


Per capita membership or following:




Conservative estimate:

Vermont Community – 0.085%

Vermont Collection – 0.075%

Colorado Community – 0.6%


Keyword search statistical significance test from raw test data with high-end tail comparison




“input” – Vermont Community, Vermont Collection, Colorado Community

“_” –  8335, 8933, 8816

“"” – 10916, 11181, 10670

“head” – 48, 49, 49

“fix” – 12, 12, 15

“array” – 108, 110, 108

“marijuana” – 41, 57, 21

“cannabis” – 37, 29, 4

“meta” – 22, 21, 22

“content” – 240, 252, 259

“function” – 2264, 2256, 2264

“getOptions” – 3, 3, 3

“window” – 420, 417, 420

“element” – 125, 125, 125

“null” – 5226, 5489, 4582

“try” – 155, 159, 158

“dump” – 53, 54, 53

“exec” – 51, 51, 51

“recreational” – 16, 24, 17

“google” – 183, 181, 205

“script” – 79, 75, 79

“true” – 151, 160, 127

“$” – 284, 284, 284
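
Counts of this kind can in principle be reproduced by counting occurrences of each search input in the saved source of a page. The following is a minimal Python sketch, not the procedure actually used: it assumes a case-sensitive, non-overlapping substring count (the original counting method is not stated), and the sample string is a hypothetical stand-in for a saved page.

```python
def count_keywords(source: str, keywords: list[str]) -> dict[str, int]:
    """Count non-overlapping, case-sensitive occurrences of each keyword."""
    return {kw: source.count(kw) for kw in keywords}


# Hypothetical stand-in for the saved source of one page (not real data).
sample_source = (
    '<head><meta content="x">'
    "<script>function f(){return null;}</script></head>"
)

sample_counts = count_keywords(
    sample_source,
    ["head", "meta", "content", "function", "null", "script"],
)
print(sample_counts)
```

A substring count also matches inside longer words (for example “head” inside “header”), which is one reason the raw totals should be read as upper bounds on deliberate use of a term.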


Theoretical Validation Scheme


This scheme has been chosen because it allows authentication of not only the server, which is standard in such attempts, but also the content which has been posted to the respective pages. A variety of pegged user handles and other contributions are involved in the creation and maintenance of pages about a controlled substance, one which is limited in distribution to those over a certain age and which in some locations can be met with severe legal repercussions. There is therefore an intrinsic value to vetting cyber information.

In order to avoid such misunderstandings, a thorough read-through of all source data was initiated and completed with the following results. Rather than sifting through user data or implementing the cryptographic hash system recommended in the theoretical scheme, an evaluation of cross-origin data leakage was carried out to identify reconnaissance activities by potential or real attackers (SRI, 5.3). This approach is non-intrusive, experimental, and potentially more accurate for the purposes of identification than traditional methods of code evaluation.
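
For reference, the cryptographic hash system recommended in the Subresource Integrity scheme, which was not implemented here, amounts to publishing a base64-encoded digest of each resource and comparing it against the content actually delivered. The following is a minimal Python sketch under stated assumptions: SHA-384 is chosen as the digest, and the function names and resource body are hypothetical illustrations.

```python
import base64
import hashlib


def sri_value(resource: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity string of the form '<algorithm>-<base64 digest>'."""
    digest = hashlib.new(algorithm, resource).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(resource: bytes, integrity: str) -> bool:
    """Check delivered content against a previously published integrity string."""
    algorithm, _, _ = integrity.partition("-")
    return sri_value(resource, algorithm) == integrity


# Hypothetical resource body, used only for illustration.
published = sri_value(b"console.log('hello');")
print(published)
print(matches(b"console.log('hello');", published))
print(matches(b"console.log('tampered');", published))
```

Any change to the delivered bytes changes the digest, so a mismatch signals that the content differs from what was originally vetted.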


Discussion of data, results and conclusions


Cannabis Use Rates and Trends




The growth in Colorado is under 30% in recent years, marking a substantially larger presence than has been found in Vermont. Statistical analysis found ratios of 1:1.25:1.5 for the Vermont Collection, Vermont Community, and Colorado Community, taken from the .2% of the total populations of the respective states that were admitted to the hospital for potential marijuana dependent symptoms, according to a recent update from the White House which cites data from 2010. Multiplied by a cohort with an average life expectancy of 78 years, this data could encompass almost 60% of current marijuana users. It could also comprise the entirety of the population who have used before entering high school, according to a report released by the UN in 2014.

A statistically interesting point that is not addressed in this paper is that the trend for admission for marijuana related episodes shows a dramatic variance between those two populations, as the number of marijuana users per capita was about 30-40% higher in Colorado at the time. Potential explanations include the presence of higher potency marijuana in Vermont due to a lack of effective regulations during the transition period of decriminalization. Another could be adulterants such as lead, which decrease the flashpoint at which a joint or a bowl is lit and increase the temperature at which smoked material is absorbed. A third could be the popularity of edibles, which may be more potent than a smoked product, as there are nearly a third more tobacco users in Colorado per capita than in Vermont.


Keyword search with statistical significance analysis




A perfect match across all three pages was reached in four of 22 source code searches, nearly 20% of the total results. Taken as an outlier result, this demonstrates definite significance. Two of three quantitative forms found a perfect match in exactly half of the searches. The natural odds of these events occurring are roughly one in 420,000,000 (one over 4.2 times 100,000,000). This indicates a high probability of interactivity occurring between these web-based pages.
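
The perfect-match count can be rechecked directly from the keyword table in the data results. The following Python sketch transcribes that table and counts the searches in which all three pages agree; the variable and function names are illustrative, and the pairwise helper is an added convenience rather than part of the original analysis.

```python
from itertools import combinations

PAGES = ("Vermont Community", "Vermont Collection", "Colorado Community")

# Keyword counts transcribed from the data results table, in PAGES order.
COUNTS = {
    "_": (8335, 8933, 8816), '"': (10916, 11181, 10670),
    "head": (48, 49, 49), "fix": (12, 12, 15), "array": (108, 110, 108),
    "marijuana": (41, 57, 21), "cannabis": (37, 29, 4),
    "meta": (22, 21, 22), "content": (240, 252, 259),
    "function": (2264, 2256, 2264), "getOptions": (3, 3, 3),
    "window": (420, 417, 420), "element": (125, 125, 125),
    "null": (5226, 5489, 4582), "try": (155, 159, 158),
    "dump": (53, 54, 53), "exec": (51, 51, 51),
    "recreational": (16, 24, 17), "google": (183, 181, 205),
    "script": (79, 75, 79), "true": (151, 160, 127), "$": (284, 284, 284),
}

# Searches where all three pages returned the same count.
perfect = [kw for kw, row in COUNTS.items() if len(set(row)) == 1]
share = len(perfect) / len(COUNTS)


def pairwise_matches(counts):
    """Number of searches on which each pair of pages agrees exactly."""
    return {
        (PAGES[i], PAGES[j]): sum(row[i] == row[j] for row in counts.values())
        for i, j in combinations(range(len(PAGES)), 2)
    }


print(perfect, f"{share:.1%}")
print(pairwise_matches(COUNTS))
```

With the table as given, the perfect matches are getOptions, element, exec, and $, which is 4 of 22 searches, or about 18%.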

While there are no prohibitions on communication between two online communities, this can also serve as a template to verify that such communication is not occurring between any of the communities and communities tailored towards those who are underage. It is also a possible indicator of a malware presence, which could include a botnet, a synthetic code injector algorithm, sniffing agents, or most likely a combination of all of the above. Statistical analysis alone cannot confirm any of this; causal proof of intent of harm or defamation and of malware cyber-activity must be demonstrated. To accomplish this, an experimental form of subresource integrity is being modified and taken advantage of, referred to above as cross-origin data leakage.


DIV and Cross-Origin Data Leakage




The initial read-through raised no general concerns, though one major qualification ought to be addressed: an item present exactly once in each of the three communities that likely represents a violation of amendments to the CFAA in 1984. Due to recent legislation and expansion of that act, the consequences could be quite serious if the matter is not administratively addressed and the responsible posts promptly deleted, though no legal responsibilities exist unless the display represents an extension or whole of a small business. Colorado had significantly greater evidence of hash use, but all three communities/collection presented enough to provide a strong sense of security. Cross-origin analysis demonstrated that “content-originated” was indeed activated upon execution of the HTML. No flag or evidence of tampering of any kind presented.

Further analysis of the entirety of the source code, around 30 solid pages for each community, revealed the presence of a flag which discontinues the cross-origin protections and which could allow a JSON style attack, gaining access to passwords or other confidential credentials. This should allow a violation of the “same-origin” policy and may have been used to determine what content is present within the cross-origin resource. Whether this setting is coded on or off, the threat level is ultimately low.
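
A read-through for such a flag can be partly automated. The following is a minimal Python sketch under one assumption that the source does not confirm: that the flag in question is the HTML crossorigin attribute. It lists every tag carrying that attribute and notes whether an integrity hash accompanies it. The class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class CrossOriginScanner(HTMLParser):
    """Collect tags that carry a crossorigin attribute."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)  # attribute names arrive lowercased
        if "crossorigin" in attr_map:
            self.findings.append({
                "tag": tag,
                "crossorigin": attr_map["crossorigin"],
                "has_integrity": "integrity" in attr_map,
            })


def scan(source: str) -> list[dict]:
    """Return one finding per tag with a crossorigin attribute."""
    scanner = CrossOriginScanner()
    scanner.feed(source)
    scanner.close()
    return scanner.findings


# Hypothetical markup, not taken from the pages examined.
sample = (
    '<script src="https://example.com/a.js" crossorigin="anonymous"></script>'
    '<img src="/local.png">'
)
findings = scan(sample)
print(findings)
```

A finding without an accompanying integrity hash is not proof of exploitation; it only marks a location worth reading by hand.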


Valuation of any Potential Threat to the Pages




A discretionary valuation of a low threat level was ultimately determined, as users accessing the site are still protected by Google security and terms of use, i.e. the dedication to privacy outlined above in this document and codified in recent US law for cyberspace, as well as by amendment to the Constitution. An “Anonymous” omission of cross-origin protections appears once in the HTML code of all three websites. That does not indicate that the tool has been exploited. Unless there is an experimental lab underway through Google, the only data which should be accessible in the event of a general breach throughout the company would be the user names and profiles of individuals who are on the pages.

However, the possibility that it is a “wait and see” placement should be treated with caution, and corrective measures should be taken to eliminate the offending code from the pages. It is worth mentioning, once again, that the only parties which have any liabilities for such a piece of code are those who posted the sequence and any small business owners involved with the pages that may have turned a blind eye or aided the malware. Legal explanations may be possible, such as that one did not know about updates to the CFAA or that the code had been prepared before 1984, but even if that is the case, it does not mean that the threat or potential threat should not be snuffed out immediately.




References:


Braun, F., Akhawe, D., Weinberger, J., & West, M. Subresource integrity. W3C working draft. (2014).

The Digital Millennium Copyright Act. 17 U.S.C. § 1201 (1998).

Kesteren, A. van. Cross-Origin Resource Sharing (URL: http://www.w3.org/TR/access-control/). W3C (2014).

United Nations Office on Drugs and Crime (UNODC). Recent statistics and trend analysis of the illicit drug market. (2014).

W3C Recommendation. HTML5, A vocabulary and associated APIs for HTML and XHTML, W3 (2014).
