Friday, 29 March 2013

Assignment#10

Problem 1:-
Create 3 vectors x, y and z of equal length, filled with random values, and bind them into a matrix:
T<- cbind(x,y,z)
Create 3-dimensional plots of the matrix (all 3 types).
Solution:-
Commands:-
> values<-rnorm(50,25,6)
> x<-sample(values,10)
> y<-sample(values,10)
> z<-sample(values,10)
> T<-cbind(x,y,z)
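Because the values are drawn at random, the matrix changes on every run. A minimal sketch, assuming a fixed seed (the seed 42 and the name `values` are only illustrative, not part of the assignment), that makes the result reproducible and checks that the three columns have equal length:

```r
set.seed(42)                              # assumed seed, only for reproducibility
values <- rnorm(50, mean = 25, sd = 6)    # pool of 50 normally distributed values
x <- sample(values, 10)                   # 10 values drawn without replacement
y <- sample(values, 10)
z <- sample(values, 10)
T <- cbind(x, y, z)                       # 10 x 3 numeric matrix, one column per vector
dim(T)                                    # rows and columns of the matrix
colnames(T)                               # column names taken from the vector names
```

Note that a variable named T hides R's shorthand T for TRUE, so writing TRUE in full is safer once this matrix exists.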
Screenshots:-
Image
> library(rgl)
> plot3d(T)
Image
> plot3d(T,col=rainbow(1000))
Image
> plot3d(T,col=rainbow(1000),type='s')
Image


Problem 2:-
Read the documentation of rnorm and pnorm.
Create 2 random variables.
Create the following plots:
1. X-Y
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y) Hint: ?factor
3. Color code and draw the graph
4. Smooth and best fit line for the curve
Solution:-
Commands:-
> x<-rnorm(200,mean=5,sd=1)
> y<-rnorm(200,mean=3,sd=1)
> z1<-sample(letters,5)
> z2<-sample(z1,200,replace=TRUE)
> z<-as.factor(z2)
> t<-cbind(x,y,z)
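One point worth checking here is what cbind does to the factor. A minimal sketch, assuming a fixed seed and the five labels "a" to "e" (both illustrative assumptions), comparing the matrix from cbind with a data frame built from the same vectors:

```r
set.seed(1)                                   # assumed seed for reproducibility
labels5 <- c("a", "b", "c", "d", "e")         # hypothetical category labels
x <- rnorm(200, mean = 5, sd = 1)
y <- rnorm(200, mean = 3, sd = 1)
z <- factor(sample(labels5, 200, replace = TRUE), levels = labels5)

m  <- cbind(x, y, z)          # matrix: z is stored as its integer codes 1..5
df <- data.frame(x, y, z)     # data frame: z stays a factor with its labels

is.numeric(m)                 # the matrix is purely numeric
unique(m[, "z"])              # only the codes survive, not the letters
levels(df$z)                  # the data frame still knows the 5 categories
```

The plots further down use the vectors x, y and z directly, so the factor labels are still available for colouring.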

Screenshots:-
Image
> library(ggplot2)
> qplot(x,y)
Image
> qplot(x,z,alpha=I(2/10))
Image
> qplot(x,z)

Image
> qplot(x,y,geom=c("point","smooth"))
Image
> qplot(x,y,colour=z)
Image
> qplot(log(x),log(y),colour=z)
Image
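The smooth and best fit line can also be computed without ggplot2. A minimal sketch in base R, assuming the same kind of simulated x and y (the seed is an illustrative assumption), using lm() for the least-squares line and lowess() for a smoothed curve:

```r
set.seed(7)                        # assumed seed for reproducibility
x <- rnorm(200, mean = 5, sd = 1)
y <- rnorm(200, mean = 3, sd = 1)

fit <- lm(y ~ x)                   # least-squares best fit line
smoothed <- lowess(x, y)           # locally weighted smoother, a list with $x and $y

plot(x, y)                         # scatter plot of the two variables
abline(fit, col = "red")           # straight best fit line
lines(smoothed, col = "blue")      # smoothed curve
coef(fit)                          # intercept and slope of the fitted line
```

Since x and y are drawn independently of each other, the fitted slope is expected to be close to zero.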

Sunday, 24 March 2013

Assignment# 9

NodeXL

NodeXL is a social network analysis and visualization tool. It runs inside the Microsoft Excel spreadsheet software and lets us import data from social networking sites such as Twitter, Facebook and YouTube.
It helps us group or cluster the imported data according to our requirements.
Network analysis helps us find out who is following us or a particular website, which key people are present in a network group, what relationships exist, and details of the people connected with us or with their friends.
NodeXL has a modular architecture that supports analyzing and visualizing network data.
Set the color, shape, size, label, and opacity of individual vertices by filling in worksheet cells, or let NodeXL do it for you based on vertex attributes such as degree, betweenness centrality or PageRank.
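To make the idea of a vertex attribute concrete, here is a minimal sketch in R of how the degree of each vertex can be counted from an edge list. The names and edges are hypothetical, and this is not how NodeXL itself computes its metrics:

```r
# Hypothetical edge list: each row is one connection between two people
edges <- data.frame(
  from = c("ann", "ann", "bob", "cat", "dan"),
  to   = c("bob", "cat", "cat", "dan", "ann"),
  stringsAsFactors = FALSE
)

# Degree of a vertex = number of edges touching it (direction ignored here)
degree <- table(c(edges$from, edges$to))
degree
```

A vertex with a higher degree could then be drawn larger or in a different color.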

Twitter

Facebook


Friday, 15 March 2013

Assignment#8

Assignment 8 : Panel Data Analysis
Do a panel data analysis of the "Produc" data using three types of model:
      Pooled model
      Fixed effects model
      Random effects model

Determine which model is the best by using functions:
       pFtest : Fixed vs Pooled
       plmtest : Pooled vs Random
       phtest: Random vs Fixed
The data can be loaded using the following commands:
library(plm)
data(Produc , package ="plm")
head(Produc)

Pooled Model

pool <-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model=("pooling"),index =c("state","year"))
summary(pool)

Fixed Effects Model:
fixed<-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model=("within"),index =c("state","year"))
summary(fixed)
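To see what the "within" model does, here is a minimal sketch in base R on a small simulated panel (not the Produc data; the panel size, seed and variable names are assumptions for illustration). Subtracting each unit's own mean from the data gives the same slope as giving every unit its own intercept:

```r
set.seed(123)                                   # assumed seed
n_id <- 5; n_t <- 10                            # hypothetical panel: 5 units, 10 periods
id    <- rep(1:n_id, each = n_t)
alpha <- rep(rnorm(n_id, sd = 3), each = n_t)   # unit-specific intercepts
x     <- rnorm(n_id * n_t) + alpha              # regressor correlated with the effects
y     <- 2 * x + alpha + rnorm(n_id * n_t)

# (a) within transformation: subtract each unit's own mean, then plain OLS
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)
b_within <- coef(lm(y_dm ~ x_dm - 1))[["x_dm"]]

# (b) dummy-variable regression: one intercept per unit
b_lsdv <- coef(lm(y ~ x + factor(id)))[["x"]]

c(within = b_within, lsdv = b_lsdv)             # the two slope estimates agree
```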

Random Effects Model:
random <-plm( log(pcap) ~log(hwy)+ log(water)+ log(util) + log(pc) + log(gsp) + log(emp) + log(unemp), data=Produc,model=("random"),index =c("state","year"))
summary(random)



Testing of Model
The models can be compared through hypothesis tests, as follows:
H0: Null Hypothesis: the individual-index and time-based parameters are all zero
H1: Alternate Hypothesis: at least one of the index or time-based parameters is non-zero
Pooled vs Fixed
Null Hypothesis: Pooled Model
Alternate Hypothesis: Fixed Effects Model
Command:
> pFtest(fixed,pool)
Result:
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
F = 56.6361, df1 = 47, df2 = 761, p-value < 2.2e-16
alternative hypothesis: significant effects
Since the p-value is negligible, we reject the null hypothesis and accept the alternate hypothesis, i.e. the Fixed Effects Model.
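The idea behind this F test can be sketched in base R on simulated data (not the Produc data; the panel size, seed and names are assumptions for illustration). The pooled fit with one common intercept is compared against a fit with one intercept per unit:

```r
set.seed(321)                                   # assumed seed
n_id <- 6; n_t <- 8                             # hypothetical panel: 6 units, 8 periods
id    <- factor(rep(1:n_id, each = n_t))
alpha <- rep(rnorm(n_id, sd = 2), each = n_t)   # unit-specific intercepts
x     <- rnorm(n_id * n_t)
y     <- 1.5 * x + alpha + rnorm(n_id * n_t)

pooled <- lm(y ~ x)            # one common intercept for all units
lsdv   <- lm(y ~ x + id)       # one intercept per unit (fixed effects)

cmp <- anova(pooled, lsdv)     # F test of the extra unit intercepts
cmp$Df[2]                      # numerator degrees of freedom = n_id - 1
cmp$`Pr(>F)`[2]                # a small p-value favours the fixed effects fit
```

In the Produc result above, df1 = 47 is the 48 states minus one, and df2 = 761 is the 816 observations minus 48 state intercepts and 7 slopes.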
Pooled vs Random
Null Hypothesis: Pooled Model
Alternate Hypothesis: Random Effects Model
Command :
> plmtest(pool)
Result:
  Lagrange Multiplier Test - (Honda)
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
normal = 57.1686, p-value < 2.2e-16
alternative hypothesis: significant effects
Since the p-value is negligible, we reject the null hypothesis and accept the alternate hypothesis, i.e. the Random Effects Model.
Random vs Fixed
Null Hypothesis: No correlation between the individual effects and the regressors, i.e. Random Effects Model
Alternate Hypothesis: Fixed Effects Model
Command:
> phtest(fixed,random)
Result:
 Hausman Test
data:  log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp)
chisq = 93.546, df = 7, p-value < 2.2e-16
alternative hypothesis: one model is inconsistent
Since the p-value is negligible, we reject the null hypothesis and accept the alternate hypothesis, i.e. the Fixed Effects Model.
Conclusion:
After running all three tests, we conclude that the Fixed Effects Model is best suited for the panel data analysis of the "Produc" data set.
Hence we conclude that the individual effect of each id, i.e. each "state", is a constant that does not vary over time.