# Analysis of Zendesk Tickets

In this analysis we’ll look at Zendesk tickets that have come in for Publish.

## Data Collection

We’ll gather the Zendesk tickets from the dbt_buffer.zendesk_tickets model with the query below.

# define query
sql <- "
select distinct
t.id
, date(t.created_at) as created_at
, t.subject
, t.description as text
, t.product
, t.topic
, t.feature
, t.social_account
from dbt_buffer.zendesk_tickets t
where t.product = 'publish'
"

# query BQ
tickets <- dbGetQuery(con, sql)

# filter out tickets with old dates
tickets <- tickets %>%
filter(created_at != "1970-01-01")

# save data
saveRDS(tickets, "publish_zendesk_tickets.rds")

## Data Exploration

We have over 34 thousand tickets for Publish from October 2019 to August 2020. Let’s use the skimr package to get a brief summary of the dataset.

# skim dataset
skim(tickets)
 Name tickets Number of rows 31924 Number of columns 8 _______________________ Column type frequency: character 7 Date 1 ________________________ Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
id 0 1.00 5 5 0 24440 0
subject 0 1.00 1 255 0 16294 0
text 0 1.00 2 24189 0 24203 0
product 0 1.00 7 7 0 1 0
topic 530 0.98 5 19 0 10 0
feature 10726 0.66 2 32 0 36 0
social_account 1091 0.97 7 14 0 9 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
created_at 0 1 2019-10-23 2020-08-04 2020-04-28 264

The complete_rate in particular is interesting for the topic, feature, and social_account columns. It is understandable that these values are not always present.

## Number of Tickets Over Time

Now let’s plot the number of tickets we’ve gotten over time. We’ll count the number of tickets received each week.

Let’s break this down further by looking at the individual topics.

It’s a bit hard to see the individual topics when they’re all clumped together, so let’s organize them in a different way.

This is more clear. It shows us that payment and technical issues make up the majority of the tickets for Publish. We can summarise the median number of tickets received by topic with the plot below.

## Technical Issues

Let’s first look at the proportion of Publish tickets that are due to technical issues each month.

Now let’s take a deeper look at the issues that are driving the largest proportion of the ticket volume.

Social accounts, publishing, and composing/scheduling are the features account for most of the technical issue tickets.

Let’s look at the top 10 features and view their trends over time.

It looks like the trends are fairly consistent over time, apart from a couple anomalous spikes. Now let’s look at the social accounts that are causing the most tickets for the social_accounts and publishing features.

Unsurprisingly, Facebook, Instagram, and LinkedIn are the main culprits. We also want to see how many issues that are coming in about technical issues that are not related to connecting social accounts.

There are around 140 tickets coming in each week on average. Let’s look at some of the individual tickets.

# show individual tickets
tickets %>%
filter(week >= "2020-01-01" & topic == "tech_issues" &
(feature == "publish_publishing" |
feature == "publish_social_accounts")) %>%
select(subject, text) %>%
head(10) 
## # A tibble: 10 x 2
##    subject                                 text
##    <chr>                                   <chr>
##  1 "Compression"                           "Hi, I posted a couple posts on IG (…
##  2 "Unable to post to instagram"           "I have tried numerous times to post…
##  3 "image not showing"                     "I used your system to do a post on …
## 10 "Update Failed to Post"                 "I did the \"retry\" twice (as sugge…