External data sharing - How a Fortune 100 bank reduced time-to-data by 70% with synthetic data - YouTube

Speaker 1: At large enterprises, vendor validation and startup collaboration are definitely day-to-day business. One of our clients, a Fortune 100 bank, needs to evaluate more than 1,000 vendors and startups each and every year. As you can imagine, to do that in our digital times you either need to get these vendors or startups into your corporate organization, or you need to share the data externally. That takes ages and involves many, many people, which of course leads to significant costs. Thanks to synthetic data, our client was able to reduce this time to share data by 70%, which ultimately resulted in 10 million in cost savings each and every year. Andreas, how did this Fortune 100 bank approach this topic, and what did MOSTLY AI-generated synthetic data enable them to do?
Andreas: Yeah, so one challenge, as you already mentioned, was that it's highly sensitive data they need to use to evaluate startups. Every large corporation wants to be innovative and invite as many startups into its ecosystem as possible, but the approval processes can take up to six months. So they said they needed to find a way to evaluate startups by providing highly realistic, high-quality data, since there are data protection requirements in place, and we fully support that. We needed to find a way to accelerate the process without compromising on data quality.
Speaker 1: Absolutely, because with the traditional anonymization techniques that were used in the past, what we hear from clients is that once you have gone through this lengthy process of internal negotiations and finally gotten access to the data, it's so heavily anonymized that it is not particularly useful, and the external vendors or startups can't really do anything meaningful with it. So synthetic data really helped them find a GDPR- and CCPA-compliant way of getting realistic data out to these startups. But I think they also built an internal sandbox environment. What was the reason behind that?

Andreas: Yeah, so one challenge was of course that they needed to scale this out. It's not just one or two startups they wanted to have participate in the synthetic data analysis process; they wanted this as a machine for evaluating startups at scale. The moment a new startup comes along, they want the environment ready, along with the synthetic data they want to use for testing, so that they can speed up not only the data access but also the whole evaluation process.
Speaker 1: Absolutely. And it's not the boring old sandbox that organizations know from the past, where you still have the issue of not having realistic data. They really had fresh, super granular, super representative synthetic data from different departments and different data sets across the organization, so various startups and vendors could make use of this data immediately to test their products and help the bank figure out whether a vendor would be of benefit to them or not. But they didn't stop with vendor validation. What else was an exciting use case for our synthetic data technology for them?
Andreas: Yeah, so based on the many conversations we have with our clients, we observe that cloud is now the normality, and every prospect wants to utilize all the cloud technologies that are already available. But as you know, sensitive data doesn't end up in the cloud, for good reasons. So they needed to find a way to utilize the power of the cloud tools without compromising on the privacy aspect, and that's why they see synthetic data as a great opportunity here: highly realistic test data, or synthetic data for AI/ML training, that can be utilized in the cloud to run tests in that environment, which wouldn't be possible without this highly private synthetic data.
[238]
especially when you do ai and ml
[240]
training you're not interested in the
[242]
privacy sensitive parts of the data
[243]
you're not interested in the individual
[245]
but you really want to get the patterns
[247]
the insights and this is something
[248]
that's perfectly preserved with the
[250]
synthetic data that most degenerate
[252]
produces
[253]
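[Editor's note: as a toy illustration of that point (a generic statistical sketch, not MOSTLY AI's actual method), one can fit only the aggregate structure of a table and then sample entirely new records from it. The column names and numbers below are invented for the example.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "real" customer table: income and monthly spend,
# positively correlated by construction.
n = 2_000
income = rng.normal(60_000, 12_000, size=n)
spend = 0.02 * income + rng.normal(0, 150, size=n)
real = np.column_stack([income, spend])

# Learn only aggregate patterns: column means and the covariance matrix.
mu = real.mean(axis=0)
cov = np.cov(real, rowvar=False)

# Sample brand-new synthetic records from the fitted distribution.
synthetic = rng.multivariate_normal(mu, cov, size=n)

# The pattern (income/spend correlation) is preserved...
real_corr = np.corrcoef(real, rowvar=False)[0, 1]
syn_corr = np.corrcoef(synthetic, rowvar=False)[0, 1]
print(f"real corr = {real_corr:.2f}, synthetic corr = {syn_corr:.2f}")

# ...while no synthetic row reproduces any real individual.
matches = (synthetic[:, None, :] == real[None, :, :]).all(-1).sum()
print("exact row matches:", matches)
```

Production synthetic-data generators use far richer models (for example deep generative networks), precisely because a single Gaussian cannot capture categorical columns, skew, or complex dependencies; the sketch only shows the principle that aggregate patterns, not individuals, are what carries over.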
And I think this will definitely become more and more important for many organizations. We all know how difficult it is to get access to AI talent, and the more your organization and your workforce can rely on the out-of-the-box solutions that the big cloud providers offer, the faster you can accelerate AI and really scale it within your organization. So this is definitely something where we expect to see much more moving forward.

Andreas: Yes, definitely.